Abstract

We study the computational complexity of the graph modification problems Threshold Editing 
and Chain Editing, adding and deleting as few edges as possible to transform the input into a 
threshold (or chain) graph. In this article, we show that both problems are NP-hard, resolving a 
conjecture by Natanzon, Shamir, and Sharan (Discrete Applied Mathematics, 113(1):109-128, 2001). 
On the positive side, we show that the problem admits a quadratic vertex kernel. Furthermore, we give a
subexponential time parameterized algorithm solving Threshold Editing in $2^{O(\sqrt{k} \log k)} + \mathrm{poly}(n)$
time, making it one of relatively few natural problems in this complexity class on general graphs.
These results are of broader interest to the field of social network analysis, where recent work of
Brandes (ISAAC, 2014) posits that the minimum edit distance to a threshold graph gives a good 
measure of consistency for node centralities. Finally, we show that all our positive results extend 
to the related problem of Chain Editing, as well as the completion and deletion variants of both 
problems. 


1 Introduction 

In this paper we study the computational complexity of two edge modification problems, namely editing 
to threshold graphs and editing to chain graphs. Graph modification problems ask whether a given 
graph G can be transformed to have a certain property using a small number of edits (such as deleting/adding
vertices or edges), and have been the subject of significant previous work [29, 7, 8, 9, 25].

In the Threshold Editing problem, we are given as input an n-vertex graph G = (V, E) and a
non-negative integer k. The objective is to find a set F of at most k pairs of vertices such that G minus
any edges in F plus all non-edges in F is a threshold graph. A graph is a threshold graph if it can be
constructed from the empty graph by repeatedly adding either an isolated vertex or a universal vertex [3] . 
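
The constructive definition above can be read backwards as a recognition test: the vertex added last is isolated or universal, and removing it leaves a threshold graph again. The following is only an illustrative sketch; the adjacency-dictionary representation and the name `is_threshold` are assumptions made for this example and do not come from the paper.

```python
def is_threshold(adj):
    """adj maps each vertex to the set of its neighbours (simple undirected graph)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    while adj:
        n = len(adj)
        # Look for a vertex that is isolated (degree 0) or universal (degree n - 1).
        pick = next((v for v, ns in adj.items() if len(ns) in (0, n - 1)), None)
        if pick is None:
            return False  # no vertex could have been added last
        for u in adj.pop(pick):
            adj[u].discard(pick)
    return True


# A star is a threshold graph, a path on four vertices is not.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_threshold(star), is_threshold(p4))
```

Greedy removal is safe here because every induced subgraph of a threshold graph is again a threshold graph, so it never matters which isolated or universal vertex is removed first.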

Threshold Editing 

Input: A graph G and a non-negative integer k 

Question: Is there a set $F \subseteq [V]^2$ of size at most $k$ such that $G \triangle F$ is a threshold graph?

The computational complexity of Threshold Editing has repeatedly been stated as open, starting 
from Natanzon et al. [27], and then more recently by Burzyn et al. [4], and again very recently by Liu, 
Wang, Guo and Chen [21]. We resolve this by showing that the problem is indeed NP-hard. 

Theorem 1. Threshold Editing is NP-complete, even on split graphs. 

Graph editing problems are well-motivated by problems arising in the applied sciences, where we 
often have a predicted model from domain knowledge, but observed data fails to fit this model exactly. 
In this setting, edge modification corresponds to correcting false positives (and/or false negatives) to 
obtain data that is consistent with the model. Threshold Editing has specifically been of recent 
interest in the social sciences, where Brandes et al. are using distance to threshold graphs in work on 
axiomatization of centrality measures [2, 28]. More generally, editing to threshold graphs and their close 
relatives chain graphs arises in the study of sparse matrix multiplications [31]. Chain graphs are the 
Figure 1: Threshold graphs are $\{C_4, P_4, 2K_2\}$-free. Chain graphs are bipartite graphs that are $2K_2$-free.


bipartite analogue of threshold graphs (see Definition 2.6), and here we also establish hardness of Chain 
Editing. 

Theorem 2. Chain Editing is NP-complete, even on bipartite graphs. 

Our final complexity result is for Chordal Editing — a problem whose NP-hardness is well-known 
and widely used. This result also follows from our techniques, and as the authors were unable to find a 
proof in the literature, we include this argument for the sake of completeness. 

Having settled the complexity of these problems, we turn to studying ways of dealing with their intractability.
Cai’s theorem [5] shows that Threshold Editing and Chain Editing are fixed parameter
tractable, i.e., solvable in $f(k) \cdot \mathrm{poly}(n)$ time where $k$ is the edit distance from the desired model (graph
class). However, the lower bounds we prove when showing NP-hardness are on the order of $2^{o(\sqrt{k})}$ under
ETH, and thus leave a gap. We show that it is in fact the lower bound which is tight (up to logarithmic
factors in the exponent) by giving a subexponential time algorithm for both problems.

Theorem 3. Threshold Editing and Chain Editing admit $2^{O(\sqrt{k} \log k)} + \mathrm{poly}(n)$ subexponential
time algorithms.

Since our results also hold for the completion and deletion variants of both problems (when F is
restricted to be a set of non-edges or edges, respectively), this also answers a question of Liu et al. [22] 
by giving a subexponential time algorithm for Chain Edge Deletion. 

A crucial first step in our algorithms is to preprocess the instance, reducing to a kernel of size 
polynomial in the parameter. We give quadratic kernels for all three variants (of both Threshold 
Editing and Chain Editing). 

Theorem 4. Threshold Editing, Threshold Completion, and Threshold Deletion admit
polynomial kernels with $O(k^2)$ vertices.

This answers (affirmatively) a recent question of Liu, Wang and Guo [20]: whether the previously
known kernel, which has $O(k^3)$ vertices, for Threshold Completion (equivalently Threshold Deletion)
can be improved.


2 Preliminaries 

Graphs. We will consider only undirected simple finite graphs. For a graph $G$, let $V(G)$ and $E(G)$
denote the vertex set and the edge set of $G$, respectively. For a vertex $v \in V(G)$, by $N_G(v)$ we denote
the open neighborhood of $v$, i.e. $N_G(v) = \{u \in V(G) \mid uv \in E(G)\}$. The closed neighborhood of $v$,
denoted by $N_G[v]$, is defined as $N_G(v) \cup \{v\}$. These notions are extended to subsets of vertices as follows:
$N_G[X] = \bigcup_{v \in X} N_G[v]$ and $N_G(X) = N_G[X] \setminus X$. We omit the subscript whenever $G$ is clear from
context.

When $U \subseteq V(G)$ is a subset of vertices of $G$, we write $G[U]$ to denote the induced subgraph of $G$, i.e.,
the graph $G' = (U, E_U)$ where $E_U$ is $E(G)$ restricted to $U$. The degree of a vertex $v \in V(G)$, denoted
$\deg_G(v)$, is the number of vertices it is adjacent to, i.e., $\deg_G(v) = |N_G(v)|$. We denote by $\Delta(G)$ the
maximum degree in the graph, i.e., $\Delta(G) = \max_{v \in V(G)} \deg(v)$. For a set $A$, we write $[A]^2$ to denote
the set of unordered pairs of elements of $A$; thus $E(G) \subseteq [V(G)]^2$. By $\overline{G}$ we denote the complement of
graph $G$, i.e., $V(\overline{G}) = V(G)$ and $E(\overline{G}) = [V(G)]^2 \setminus E(G)$.

For two sets $A$ and $B$ we define the symmetric difference of $A$ and $B$, denoted $A \triangle B$, as the set
$(A \setminus B) \cup (B \setminus A)$. For a graph $G = (V, E)$ and $F \subseteq [V]^2$ we define $G \triangle F$ as the graph $(V, E \triangle F)$.
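
To make the editing operation concrete, the sketch below applies an editing set to an edge set via symmetric difference and then finds, by exhaustive search, a smallest editing set that yields a threshold graph. It is meant for tiny examples only. All names (`apply_edits`, `is_threshold`, `min_threshold_edits`) are hypothetical, and the threshold test uses the standard pairwise nested-neighborhood criterion ($N(u) \subseteq N[v]$ or $N(v) \subseteq N[u]$ for all $u, v$), which is an assumption of this example rather than the paper's algorithm.

```python
from itertools import combinations


def apply_edits(edges, F):
    """Edge set of G triangle F: pairs in F are deleted if present, added otherwise."""
    norm = lambda pairs: {frozenset(p) for p in pairs}
    return norm(edges) ^ norm(F)


def is_threshold(vertices, edges):
    """Pairwise test: N(u) is contained in N[v], or N(v) is contained in N[u]."""
    nb = {v: set() for v in vertices}
    for e in edges:
        a, b = tuple(e)
        nb[a].add(b)
        nb[b].add(a)
    return all(nb[u] <= nb[v] | {v} or nb[v] <= nb[u] | {u}
               for u, v in combinations(vertices, 2))


def min_threshold_edits(vertices, edges):
    """Try all editing sets F in order of increasing size (exponential time)."""
    pairs = list(combinations(vertices, 2))
    for k in range(len(pairs) + 1):
        for F in combinations(pairs, k):
            if is_threshold(vertices, apply_edits(edges, F)):
                return k, set(F)


# The path 0-1-2-3 needs exactly one edit to become a threshold graph.
k, F = min_threshold_edits([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
print(k, F)
```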

Figure 2: A threshold partition. The left hand side is the clique and the right hand side is an independent
set; each bag contains a twin class. All bags are non-empty (otherwise two twin classes on the opposite
side would collapse into one), except possibly the two extremal bags.

For a graph $G$ and a vertex $v$ we define the true twin class of $v$, denoted $\mathrm{ttc}(v)$, as the set
$\mathrm{ttc}(v) = \{u \in V(G) \mid N[u] = N[v]\}$. Similarly, we define the false twin class of $v$, denoted $\mathrm{ftc}(v)$, as the set
$\mathrm{ftc}(v) = \{u \in V(G) \mid N(u) = N(v)\}$. Observe that either $\mathrm{ttc}(v) = \{v\}$ or $\mathrm{ftc}(v) = \{v\}$. From this
we define the twin class of $v$, denoted $\mathrm{tc}(v)$, as $\mathrm{ttc}(v)$ if $|\mathrm{ttc}(v)| > |\mathrm{ftc}(v)|$ and $\mathrm{ftc}(v)$ otherwise.
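
The twin-class definitions translate almost literally into code. The sketch below is illustrative; the adjacency-dictionary input and the function names are assumptions of this example.

```python
def true_twins(adj, v):
    """ttc(v): vertices with the same closed neighbourhood as v."""
    closed = lambda x: adj[x] | {x}
    return {u for u in adj if closed(u) == closed(v)}


def false_twins(adj, v):
    """ftc(v): vertices with the same open neighbourhood as v."""
    return {u for u in adj if adj[u] == adj[v]}


def twin_class(adj, v):
    """tc(v): the true twin class if it is strictly larger, else the false twin class."""
    ttc, ftc = true_twins(adj, v), false_twins(adj, v)
    return ttc if len(ttc) > len(ftc) else ftc


# Leaves of a star are false twins; vertices of a triangle are true twins.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(twin_class(star, 1), twin_class(triangle, 0))
```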

Split and threshold graphs. A split graph is a graph $G = (V, E)$ whose vertex set can be partitioned
into two sets $C$ and $I$ such that $G[C]$ is a complete graph and $G[I]$ is edgeless, i.e., $C$ is a clique and $I$
an independent set [3]. For a split graph $G$ we say that a partition $(C, I)$ of $V(G)$ forms a split partition
of $G$ if $G[C]$ induces a clique and $G[I]$ an independent set. A split partition $(C, I)$ is called a complete
split partition if for every vertex $v \in I$, $N(v) = C$. If $G$ admits a complete split partition, we say that
$G$ is a complete split graph.

We now give two useful characterizations of threshold graphs: 

Proposition 2.1 ([23]). A graph $G$ is a threshold graph if and only if $G$ has a split partition $(C, I)$
such that the neighborhoods of the vertices in $I$ are nested, i.e., for every pair of vertices $v$ and $u$, either
$N(v) \subseteq N[u]$ or $N(u) \subseteq N[v]$.

Proposition 2.2 ([3]). A graph $G$ is a threshold graph if and only if $G$ does not have a $C_4$, $P_4$ nor
a $2K_2$ as an induced subgraph. Thus, the threshold graphs are exactly the $\{C_4, P_4, 2K_2\}$-free graphs (see
Figure 1).
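
The forbidden-subgraph characterization of Proposition 2.2 can be checked by brute force on small graphs. The sketch below relies on the fact that, among graphs on four vertices, $2K_2$, $P_4$ and $C_4$ are each identified by their sorted degree sequence. The function name and input format are assumptions of this example; the running time is $O(n^4)$, so it is a didactic aid rather than a practical algorithm.

```python
from itertools import combinations

# Sorted degree sequences that identify the three obstructions on four vertices.
FORBIDDEN = {(1, 1, 1, 1): "2K2", (1, 1, 2, 2): "P4", (2, 2, 2, 2): "C4"}


def find_obstruction(adj):
    """Return (name, vertices) of an induced 2K2, P4 or C4, or None if there is none."""
    for quad in combinations(adj, 4):
        q = set(quad)
        # Degrees inside the subgraph induced by the four chosen vertices.
        degs = tuple(sorted(len(adj[v] & q) for v in quad))
        if degs in FORBIDDEN:
            return FORBIDDEN[degs], quad
    return None


p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(find_obstruction(p4), find_obstruction(star))
```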

Definition 2.3 (Threshold partition, lev(v)). We say that $(\mathcal{C}, \mathcal{I}) = ((C_1, \dots, C_t), (I_1, \dots, I_t))$ forms a
threshold partition of $G$ if the following holds (see Figure 2 for an illustration):

• $(C, I)$ is a split partition of $G$, where $C = \bigcup_{i \le t} C_i$ and $I = \bigcup_{i \le t} I_i$.

• $C_i$ and $I_i$ are twin classes in $G$ for every $i$.

• $N[C_j] \subseteq N[C_i]$ and $N(I_i) \subseteq N(I_j)$ for every $i < j$.

• Finally, we demand that for every $i \le t$, $(C_i, I_{\ge i})$ form a complete split partition of the graph
induced by $C_i \cup I_{\ge i}$.

We furthermore define, for every vertex $v$ in $G$, $\mathrm{lev}(v)$ as the number $i$ such that $v \in C_i \cup I_i$, and we
denote each level $L_i = C_i \cup I_i$.

In a threshold decomposition we will refer to $C_i$ for every $i$ as a clique fragment and $I_i$ as an independent
fragment. Furthermore, we will refer to a vertex in $\bigcup_i C_i$ as a clique vertex and a vertex in $\bigcup_i I_i$ as an
independent vertex.
Proposition 2.4 (Threshold decomposition). A graph G is a threshold graph if and only if G admits a
threshold partition. 

Proof. Suppose that G is a threshold graph and therefore admits a nested ordering of the neighborhoods 
of vertices of each side [19]. We show that partitioning the graph into partitions depending only on their 
degree yields the levels of a threshold partition. The clique side is naturally defined as the maximal set 
of highest degree vertices that form a clique. Suppose now for contradiction that this did not constitute 
a threshold partition. By definition, every level consists of twin classes, and also, for two twin classes
$I_i$ and $I_j$, since their neighborhoods are nested in the threshold graph, their neighborhoods are nested
in the threshold partition as well. So what is left to verify is that $(C_i, I_{\ge i})$ is a complete split partition
of $G[C_i \cup I_{\ge i}]$. But that follows directly from the assumption that $G$ admitted a nested ordering and $C_i$
is a true twin class.

For the reverse direction, suppose $G$ admits a threshold partition $(\mathcal{C}, \mathcal{I})$. Consider any four
vertices $a, b, c, d$. We will show that they cannot form any of the induced obstructions (see Figure 1).
For the $2K_2$ and $C_4$, it is easy to see that at most two of the vertices can be in the clique part of the
decomposition (and they must be adjacent since it is a clique), and hence there must be an edge in the
independent set part of the decomposition, which contradicts the assumption that $(\mathcal{C}, \mathcal{I})$ was a threshold
partition. So suppose now that $a, b, c, d$ forms a $P_4$. Again with the same reasoning as above, the middle
edge $bc$ must be contained in the clique part, hence $a$ and $d$ must be in the independent set part. But
since the neighborhoods of $a$ and $d$ should be nested, they cannot have a private neighbor each, hence
either $ac$ or $bd$ must be an edge, which contradicts the assumption that $a, b, c, d$ induced a $P_4$. This
concludes the proof. □

Lemma 2.5. For every instance $(G, k)$ of Threshold Editing or Threshold Completion it holds
that there exists an optimal solution $F$ such that for every pair of vertices $u, v \in V(G)$, if $N_G(u) \subseteq N_G[v]$
then $N_{G \triangle F}(u) \subseteq N_{G \triangle F}[v]$.

Proof. Let us define, for any editing set $F$ and two vertices $u$ and $v$, the set

$F_{u \leftrightarrow v} = \{e \mid e' \in F \text{ and } e \text{ is } e' \text{ with } u \text{ and } v \text{ switched}\}$.

Suppose $F$ is an optimal solution for which the above statement does not hold. Then $N_G(u) \subseteq N_G[v]$ and
$N_{G \triangle F}(v) \subseteq N_{G \triangle F}[u]$ (see Proposition 2.1). But then it is easy to see that we can flip edges in an ordering
such that at some point, say after flipping $F^0$, $u$ and $v$ are twins in this intermediate graph $G \triangle F^0$. Let
$F^1 = F \setminus F^0$. It is clear that for $G' = G \triangle (F^0 \cup F^1_{u \leftrightarrow v})$, $N_{G'}(u) \subseteq N_{G'}[v]$. Since $|F| \ge |F^0 \cup F^1_{u \leftrightarrow v}|$, the
claim holds. □

Chain graphs. Chain graphs are the bipartite graphs whose neighborhoods of the vertices on one 
of the sides form an inclusion chain. It follows that the neighborhoods on the opposite side form an 
inclusion chain as well. If this is the case, we say that the neighborhoods are nested. The relation to 
threshold graphs is obvious, see Figure 3 for a comparison. The problem of completing edges to obtain a 
chain graph was introduced by Golumbic [16] and later studied by Yannakakis [31], Feder, Mannila and 
Terzi [12] and finally by Fomin and Villanger [14] who showed that Chain Completion when given a 
bipartite graph whose bipartition must be respected is solvable in subexponential time. 

Definition 2.6 (Chain graph). A bipartite graph $G = (A, B, E)$ is a chain graph if there is an ordering
of the vertices of $A$, $a_1, a_2, \dots, a_{|A|}$, such that $N(a_1) \subseteq N(a_2) \subseteq \dots \subseteq N(a_{|A|})$.
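
Definition 2.6 suggests a direct test: if the neighborhoods of $A$ form an inclusion chain at all, then sorting $A$ by degree yields such an ordering, so it suffices to compare consecutive neighborhoods. The sketch below is illustrative; the input format (a list `A` and a dictionary `nbrs` from vertices of $A$ to neighbor sets in $B$) is an assumption of this example.

```python
def is_chain(A, nbrs):
    """True if the neighbourhoods of the vertices in A are nested (an inclusion chain)."""
    order = sorted(A, key=lambda a: len(nbrs[a]))
    # In a chain, each neighbourhood is contained in the next larger one.
    return all(nbrs[x] <= nbrs[y] for x, y in zip(order, order[1:]))


nested = {"a1": set(), "a2": {"x"}, "a3": {"x", "y"}}
two_k2 = {"a1": {"x"}, "a2": {"y"}}
print(is_chain(list(nested), nested), is_chain(list(two_k2), two_k2))
```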

From the following proposition, it follows that chain graphs are characterized by a finite set of 
forbidden induced subgraphs and hence are subject to Cai’s theorem [5]. 

Proposition 2.7 ([3]). Let G be a graph. The following are equivalent: 

• G is a chain graph. 

• G is bipartite and $2K_2$-free.

• G is $\{2K_2, C_3, C_5\}$-free.

• G can be constructed from a threshold graph by removing all the edges in the clique partition. 
Figure 3: Illustration of the similarities between chain and threshold graphs: (a) a chain graph, (b) a
threshold graph. Note that the nodes drawn can be replaced by twin classes of any size, even empty.
However, if on one side of a level there is an empty class, the other two levels on the opposite side
will collapse to a twin class. See Proposition 2.4.

Since they have the same structure as threshold graphs, it is natural to talk about a chain decomposition
$(\mathcal{A}, \mathcal{B})$ of a bipartite graph $G$ with bipartition $(A, B)$. We say that $(\mathcal{A}, \mathcal{B})$ is a chain decomposition
for a chain graph $G$ if and only if $(\mathcal{A}, \mathcal{B})$ is a threshold decomposition for the corresponding threshold
graph $G'$ where $A$ is made into a clique.

Parameterized complexity. The running time of an algorithm in classical complexity analysis is 
described as a function of the length of the input. To refine the analysis of computationally hard 
problems, especially NP-hard problems, parameterized complexity introduced the notion of an extra 
“parameter”—an additional part of a problem instance used to measure the problem complexity when 
the parameter is taken into consideration. To simplify the notation, here we consider inputs to problems 
to be of the form (G, k) —a pair consisting of a graph G and a nonnegative integer k. We will say 
that a problem is fixed parameter tractable whenever there is an algorithm solving the problem in time
$f(k) \cdot \mathrm{poly}(|G|)$, where $f$ is any function, and $\mathrm{poly} : \mathbb{N} \to \mathbb{N}$ any polynomial function. In the case when
$f(k) = 2^{o(k)}$ we say that the algorithm is a subexponential parameterized algorithm. When a problem
$\Pi \subseteq \mathcal{G} \times \mathbb{N}$ is fixed-parameter tractable, where $\mathcal{G}$ is the class of all graphs, we say that $\Pi$ belongs to
the complexity class FPT. For a more rigorous introduction to parameterized complexity we refer to the
book of Flum and Grohe [13]. 

Given a parameterized problem $\Pi$, we say two instances $(G, k)$ and $(G', k')$ are equivalent if $(G, k) \in \Pi$
if and only if $(G', k') \in \Pi$. A kernelization algorithm (or kernel) is a polynomial-time algorithm for a
parameterized problem $\Pi$ that takes as input a problem instance $(G, k)$ and returns an equivalent instance
$(G', k')$, where both $|G'|$ and $k'$ are bounded by $f(k)$ for some function $f$. We then say that $f$ is the size
of the kernel. When $k' \le k$, we say that the kernel is a proper kernel. Specifically, a proper polynomial
kernelization algorithm for $\Pi$ is a polynomial time algorithm which takes as input an instance $(G, k)$ and
returns an equivalent instance $(G', k')$ with $k' \le k$ and $|G'| \le p(k)$ for some polynomial function $p$.

Definition 2.8 (Laminar set system, [11]). A set system $\mathcal{F} \subseteq 2^U$ over a ground set $U$ is called laminar
if for every $X_1$ and $X_2$ in $\mathcal{F}$ with $x_1 \in X_1 \setminus X_2$ and $x_2 \in X_2 \setminus X_1$, there is no $Y \in \mathcal{F}$ with $\{x_1, x_2\} \subseteq Y$.

An equivalent way of looking at a laminar set system $\mathcal{F}$ is that every two sets $X_1$ and $X_2$ in $\mathcal{F}$ are
either disjoint or nested, that is, for every $X_1, X_2 \in \mathcal{F}$ either $X_1 \cap X_2 = \emptyset$, or $X_1 \subseteq X_2$ or $X_2 \subseteq X_1$.

Lemma 2.9 ([11]). Let $\mathcal{F}$ be a laminar set system over a finite ground set $U$. Then the cardinality of $\mathcal{F}$
is at most $|U| + 1$.

3 Hardness 

In this section we show that Threshold Editing is NP-complete. Recalling (see Figure 3) that chain 
graphs are bipartite graphs with structure very similar to that of threshold graphs, it should not be 
surprising that we obtain as a corollary that Chain Editing is NP-complete as well. 
Figure 4: The connections of a clause and a variable. All the vertices on the top (the variable vertices)
belong to the clique, while the vertices on the bottom (the clause vertices) belong to the independent
set. The vertices in the left part of the clique have higher degree than the vertices of the right part of
the clique, whereas the clause vertices (in the independent set) all have the same degree, namely
$3 \cdot |V_\varphi|$.


We will also conclude the section by giving a proof for the fact that Chordal Editing is NP-complete.
Although this has been known for a long time (Natanzon [26], Natanzon et al. [27], Sharan [30]), the
authors were unable to find a proof in the literature for the NP-completeness of Chordal Editing and
therefore include the observation. The problem was recently shown to be FPT by Cao and Marx [6];
however, we would like to point out that the more general problem studied there is indeed well-known to
be NP-complete as it is a generalized version of Chordal Vertex Deletion.


3.1 NP-completeness of Threshold Editing 

Recall that a boolean formula is in 3-CNF-SAT if it is in conjunctive normal form and each clause
has at most three variables. Our hardness reduction is from the problem 3Sat, where we are given a
3-CNF-SAT formula $\varphi$ and asked to decide whether $\varphi$ admits a satisfying assignment. We will denote
by $C_\varphi$ the set of clauses, and by $V_\varphi$ the set of variables in a given 3-CNF-SAT formula $\varphi$. An assignment
for a formula $\varphi$ is a function $\alpha : V_\varphi \to \{\text{true}, \text{false}\}$. Furthermore, we assume we have some natural
lexicographical ordering $<_{\mathrm{lex}}$ of the clauses $c_1, \dots, c_{|C_\varphi|}$ and the same for the variables $v_1, \dots, v_{|V_\varphi|}$,
hence we may write, for some variables $x$ and $y$, that $x <_{\mathrm{lex}} y$. To immediately get an impression of the
reduction we aim for, the construction is depicted in Figure 4.


3.1.1 Construction 

Recall that we want to form a graph $G_\varphi$ and pick an integer $k_\varphi$ so that $(G_\varphi, k_\varphi)$ is a yes-instance of
Threshold Editing if and only if $\varphi$ is satisfiable. We will design $G_\varphi$ to be a split graph, so that
the split partition is forced to be maintained in any threshold graph within distance $k_\varphi$ of $G_\varphi$, where

$k_\varphi = |C_\varphi| \cdot (3|V_\varphi| - 1)$.

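Given $\varphi$, we first create a clique of size $6|V_\varphi|$. To each variable $x \in V_\varphi$ we associate six vertices of
this clique, and order them in the following manner:

$v_a^x, \; v_b^x, \; v_\top^x, \; v_\bot^x, \; v_c^x, \; v_d^x$.

We will throughout the reduction refer to this ordering as $\pi_\varphi$: $\pi_\varphi$ is a partial order which has

$v_a^x <_{\pi_\varphi} v_b^x <_{\pi_\varphi} v_\top^x, v_\bot^x <_{\pi_\varphi} v_c^x <_{\pi_\varphi} v_d^x$

and for every two vertices $v_*^x$ and $v_*^y$ with $x <_{\mathrm{lex}} y$, we have $v_*^x <_{\pi_\varphi} v_*^y$. Observe that we do not specify
which comes first of $v_\top^x$ and $v_\bot^x$; this is the choice that will result in the assignment $\alpha$ for $\varphi$.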

We enforce this ordering by adding $O(k_\varphi)$ vertices in the independent set. Enforcing that $v_1$ comes
before $v_2$ in the ordering is done by adding $k_\varphi + 1$ vertices in the independent set incident to all the
vertices coming before $v_1$, including $v_1$. Since swapping the position of $v_1$ and $v_2$ would demand at
least $k_\varphi + 1$ edge modifications and $k_\varphi$ is the intended budget, in any yes instance, $v_1$ ends up before $v_2$
in the ordering of the clique.

Figure 5: The cost with which we charge a clause vertex depends on the cut-off point. The x-axis denotes
the point in the lexicographic ordering which separates the vertices adjacent to the clause vertex from
the vertices not adjacent to the clause vertex.

We proceed by adding the clause gadgets. For every clause $c \in C_\varphi$, we add one vertex $v_c$ to the independent
set. Hence, the size of the independent set is $O(|C_\varphi| + \dots)$. For a variable $x$ occurring in $c$, we add
an edge between $v_c$ and $v_\bot^x$ if it occurs negatively, and between $v_c$ and $v_\top^x$ otherwise. In addition, we
make $v_c$ incident to $v_b^x$ and $v_d^x$.

For a variable $z$ which does not occur in a clause $c$, we make $v_c$ adjacent to $v_b^z$, $v_c^z$ and $v_d^z$. To
complete the reduction, we add $4(k_\varphi + 1)$ isolated vertices: $k_\varphi + 1$ vertices to the left in the independent
set, $k_\varphi + 1$ vertices to the right in the independent set, and $k_\varphi + 1$ to the left and $k_\varphi + 1$ to the right in
the clique. This ensures that no vertex will move from the clique to the independent set partition, and
vice versa.

3.1.2 Properties of the Constructed Instance 

Before proving Theorem 1, and specifically Lemma 3.4, we may observe the following, which may
serve as an intuition for the idea of the reduction. When we consider a fixed permutation of the variable
gadget vertices (the clique side), the only thing we need to determine for a clause vertex $v_c$ is the
cut-off point: the point in $\pi_\varphi$ at which the vertex $v_c$ will no longer have any neighbors. Observing that
no vertex $v_i^x$ swaps places with any other $v_j^x$ for $i, j \in \{a, b, c, d\}$, and that no $v_*^x$ changes places with $v_*^y$ for
$x \neq y \in V_\varphi$, consider a fixed permutation of the variable vertices. We charge the clause vertices with the
edits incident to the clause vertex. Since the budget is $k_\varphi = |C_\varphi| \cdot (3|V_\varphi| - 1)$, and every clause needs at
least $3|V_\varphi| - 1$ edits (upcoming Lemma 3.2), to obtain a solution we need to charge every clause vertex with
exactly $3|V_\varphi| - 1$ edits. Figure 5 illustrates the charged cost of a clause vertex.
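
The charging argument can be explored numerically. The sketch below computes, for one clause vertex and a fixed ordering of the clique, the cheapest cut-off point. It rests on assumptions that should be read as this example's interpretation of the construction, not as a definitive transcription: each variable gadget is ordered a, b, then the two literal vertices (written `"T"` and `"F"`, with the true literal first), then c, d; a clause vertex is adjacent to b, the occurring literal vertex and d for a variable in the clause, and to b, c and d for a variable not in the clause. All function names are hypothetical.

```python
def gadget_order(assignment):
    """Clique ordering: per variable a, b, the literal pair (true literal first), c, d."""
    order = []
    for x in sorted(assignment):
        lits = ["T", "F"] if assignment[x] else ["F", "T"]
        order += [(x, s) for s in ["a", "b", *lits, "c", "d"]]
    return order


def clause_neighbours(clause, variables):
    """clause maps an occurring variable to True (positive) or False (negated)."""
    nb = set()
    for x in variables:
        if x in clause:
            nb |= {(x, "b"), (x, "T" if clause[x] else "F"), (x, "d")}
        else:
            nb |= {(x, "b"), (x, "c"), (x, "d")}
    return nb


def min_cutoff_cost(clause, assignment):
    """Cheapest way to make the clause vertex adjacent to exactly a prefix of the ordering."""
    order = gadget_order(assignment)
    nb = clause_neighbours(clause, assignment)
    costs = []
    for p in range(len(order) + 1):
        added = sum(v not in nb for v in order[:p])   # missing edges left of the cut
        deleted = sum(v in nb for v in order[p:])     # existing edges right of the cut
        costs.append(added + deleted)
    return min(costs)


# Three variables: a satisfied clause costs 3*3 - 1 edits, an unsatisfied one costs 3*3.
print(min_cutoff_cost({"y": True}, {"x": False, "y": True, "z": False}))
print(min_cutoff_cost({"y": True}, {"x": False, "y": False, "z": False}))
```

Under these assumptions the minimum drops below $3|V_\varphi|$ only when the cut-off falls inside the gadget of a variable whose literal order satisfies the clause, which is the behavior the budget is designed to force.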

Observation 3.1. The graph resulting from the above procedure is a split graph and when $k_\varphi =
|C_\varphi| \cdot (3|V_\varphi| - 1)$, if $H$ is a threshold graph within distance $k_\varphi$ of $G_\varphi$, $H$ must have the same
clique-maximizing split partition as $G_\varphi$.

Lemma 3.2. Let $(G_\varphi, k_\varphi)$ be a yes instance to Threshold Editing constructed from a 3-CNF-SAT
formula $\varphi$ with $|F| \le k_\varphi$ a solution. For any clause vertex $v_c$, at least $3|V_\varphi| - 1$ edges in $F$ are incident
to $v_c$.

Proof. By the properties of $\pi_\varphi$, we know that the only vertices we may change the order of are those
corresponding to $v_\top^x$ and $v_\bot^x$. Pick any index in $\pi_\varphi$ for which we know that $v_c$ is adjacent to all vertices on
the left hand side and non-adjacent to all vertices on the right hand side. Let $L_c$ be the set of variables
whose vertices are completely adjacent to $v_c$ and $R_c$ the corresponding set completely non-adjacent
to $v_c$. By construction, $v_c$ has exactly three neighbors in each variable gadget and thus these variable gadgets
contribute $3(|L_c| + |R_c|)$ to the budget. If $L_c \cup R_c = V_\varphi$, we are done, as $v_c$ needs at least $3|V_\varphi|$ edits
here.

Suppose therefore that there is a variable $x$ whose vertex $v_a^x$ is adjacent to $v_c$ and whose vertex $v_d^x$ is non-adjacent
to $v_c$. But then we have already deleted the existing edge $v_c v_d^x$ and added the non-existing edge $v_c v_a^x$.
This immediately gives a lower bound of $3(|V_\varphi| - 1) + 2 = 3|V_\varphi| - 1$ edits. □

Figure 6: The edited version when $y$ satisfies $c_1$. We have added three edges to the gadget $x$ and deleted
three edges to the gadget $z$, and added the edge to $v_a^y$ and deleted the edge to $v_d^y$; that is, we have edited
exactly $3 \cdot 2 + 2 = 3(|V| - 1) + 2 = 3|V| - 1$ edges incident to $c_1$. Notice that if the two literal vertices of $y$
appeared in the opposite order, we would have to choose a different variable to satisfy $c_1$.

3.1.3 Proof of Correctness 

Lemma 3.3. If there is an editing set $F$ of size at most $k_\varphi$ for an instance $(G_\varphi, k_\varphi)$ constructed from a
3-CNF-SAT formula $\varphi$, and $|F(v_c)| = 3|V_\varphi| - 1$, then the $<_{\mathrm{lex}}$-highest vertex connected to $v_c$ corresponds
to a variable satisfying the clause $c$.

Proof. From the proof of Lemma 3.2, we observed that for a clause $c$ to be within budget, we must choose
a cut-off point within a variable gadget, meaning that there is a variable $x$ for which $v_c$ is adjacent to
$v_a^x$ and non-adjacent to $v_d^x$.

We now distinguish two cases: (i) $x$ is a variable occurring (w.l.o.g. positively) in $c$ and (ii) $x$ does
not occur in $c$. For (i), $v_c$ was adjacent to $v_b^x$, $v_\top^x$, and $v_d^x$. By assumption, we add the edge to $v_a^x$ and
delete the edge to $v_d^x$. But then we have already spent the entire budget, hence the only way this is a
legal editing is that $v_\top^x$ comes before $v_\bot^x$, and hence $x$ satisfies $c$. See Figure 6.

For (ii) we have that $v_c$ was adjacent to $v_b^x$, $v_c^x$, and $v_d^x$. Here we, again by assumption, add the edge
to $v_a^x$ and delete the edge to $v_d^x$. This alone costs two edits. Observe, however, that these two
edits alone are not enough; hence if we want to achieve the goal of $3|V_\varphi| - 1$ edited edges, the cut-off
index must be inside a variable gadget corresponding to a variable occurring in $c$, i.e. (i) must be the
case. □

Lemma 3.4. A 3-CNF-SAT formula $\varphi$ is satisfiable if and only if $(G_\varphi, k_\varphi)$ is a yes instance to Threshold
Editing.

Proof of Lemma 3.4. For the forwards direction, let $\varphi$ be a satisfiable 3-CNF-SAT formula, let
$\alpha : V_\varphi \to \{\text{true}, \text{false}\}$ be any satisfying assignment, and let $(G_\varphi, k_\varphi)$ be the Threshold Editing instance
as described above. Let $\pi$ be any permutation of the vertices of the clique side with the
following properties:

• for every $x <_{\mathrm{lex}} y \in V_\varphi$ we have $v_*^x <_\pi v_*^y$,

• for every $x \in V_\varphi$, we have $v_a^x <_\pi v_b^x <_\pi v_\top^x, v_\bot^x <_\pi v_c^x <_\pi v_d^x$, and finally

• for every $x \in V_\varphi$, we have $v_\bot^x <_\pi v_\top^x$ if and only if $\alpha(x) = \text{false}$.







We now show how to construct the threshold graph $H_\varphi$ from the constructed graph $G_\varphi$ by editing
exactly $k_\varphi = |C_\varphi| \cdot (3|V_\varphi| - 1)$ edges. For a clause $c$, let $x$ be any variable satisfying $c$. If $x$ appears
positively, add every non-existing edge from $v_c$ to every vertex $v \le_\pi v_\top^x$ and delete all the rest. If $x$
appears negated, use $v_\bot^x$ instead. We break the remainder of the proof in the forward direction into two
claims:

Claim 3.5. $H_\varphi$ is a threshold graph.

Proof of Claim 3.5. Let $G_\varphi$ and $\pi$ be given, both adhering to the above construction. Since $G_\varphi$ was a
split graph, $\pi$ is a total ordering of the vertices in the clique part, and every vertex of the independent set
part of $H_\varphi$ sees a prefix of the clique in this ordering, the neighborhoods are naturally nested.
Hence $H_\varphi$ is a threshold graph by Proposition 2.1. □

Claim 3.6. $|E(G_\varphi) \triangle E(H_\varphi)| = k_\varphi$.

Proof of Claim 3.6. Since we did not edit any of the edges within the clique part nor the independent set
part, we only need to count the number of edits going between a clause vertex and the variable vertices.
Let $c$ be any clause and $x$ the lexicographically smallest variable satisfying $c$. Suppose furthermore,
without loss of generality, that $x$ appears positively in $c$ and thus has $\alpha(x) = \text{true}$. We now show that
$|F(v_c)| = 3|V_\varphi| - 1$, and since $c$ was arbitrary, this concludes the proof of the claim. Since $v_c$ is adjacent
to exactly three vertices per variable, and non-adjacent to exactly three vertices per variable, we added
all the edges to the vertices appearing before the gadget of $x$ and removed all the edges to the vertices
appearing after the gadget of $x$.
This cost exactly $3(|V_\varphi| - 1) = 3|V_\varphi| - 3$, hence we have two edges left in our budget for $c$. Moreover, the
edge $v_c v_a^x$ was added and the edge $v_c v_d^x$ was deleted. Now, $v_c$ is adjacent to every vertex before, and
including, $v_\top^x$, and non-adjacent to all the vertices after $v_\top^x$. The budget used was $3(|V_\varphi| - 1) + 2 = 3|V_\varphi| - 1$.
Hence, the total number of edges edited to obtain $H_\varphi$ is $\sum_{c \in C_\varphi} (3|V_\varphi| - 1) = |C_\varphi| \cdot (3|V_\varphi| - 1) = k_\varphi$. □

This shows that if $\varphi$ is satisfiable, then $(G_\varphi, k_\varphi)$ is a yes-instance of Threshold Editing.

In the reverse direction, let $(G_\varphi, k_\varphi)$ be a constructed instance from a given 3-CNF-SAT formula $\varphi$
and let $F$ be a minimal editing set such that $G_\varphi \triangle F$ is a threshold graph and $|F| \le k_\varphi$. We aim to
construct a satisfying assignment $\alpha : V_\varphi \to \{\text{true}, \text{false}\}$ from $G_\varphi \triangle F$. By Observation 3.1, $H = G_\varphi \triangle F$
has the same split partition as $G_\varphi$. By construction, we have enforced the ordering, $\pi_\varphi$, of each of the
vertices corresponding to the variables. Thus, we know exactly how $H$ looks, with the exception of the
internal ordering of each literal and its negation. Construct the assignment $\alpha$ as described above, i.e.,
$\alpha(x) = \text{false}$ if and only if $v_\bot^x <_\pi v_\top^x$.

By Lemmata 3.2 and 3.3, it follows directly that $\alpha$ is a satisfying assignment for $\varphi$, which concludes
the proof of the main lemma. □


The above lemma shows that there is a polynomial time many-one (Karp) reduction from 3Sat 
to Threshold Editing so we may wrap up the main theorem of this section. Lemma 3.4 implies 
Theorem 1, that Threshold Editing is NP-complete, even on split graphs. 

For the sake of the next section, devoted to the proof of Theorem 2, we define the following annotated 
version of editing to threshold graphs. In this problem, we are given a split graph and we are asked to 
edit the graph to a threshold graph while respecting the split partition. 


Split Threshold Editing 

Input: A split graph $G = (V, E)$ with split partition $(C, I)$, and an integer $k$.

Question: Is there an editing set $F \subseteq C \times I$ of size at most $k$ such that $G \triangle F$ is a threshold graph?


Corollary 3.7. Split Threshold Editing is NP-complete. 

Proof. Split Threshold Editing is clearly in NP, and that the problem is NP-complete follows immediately
from combining Lemma 3.4 with Observation 3.1. □

Corollary 3.8. Assuming ETH, neither Threshold Editing nor Split Threshold Editing is
solvable in $2^{o(\sqrt{k})} \cdot \mathrm{poly}(n)$ time.
3.2 NP-hardness of Chain Editing and Chordal Editing

3.2.1 Chain Graphs: Proof of Theorem 2 

A bipartite graph $G = (A, B, E)$ is a chain graph if the neighborhoods of $A$ are nested (which necessarily
implies the neighborhoods of $B$ are nested as well). Recalling Proposition 2.7, chain graphs are closely
related to threshold graphs. Given a bipartite graph $G = (A, B, E)$, if one replaces $A$ (or $B$) by a clique,
the resulting graph is a threshold graph if and only if $G$ was a chain graph.
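
The correspondence between chain graphs and threshold graphs can be checked on small examples. The sketch below turns side $A$ of a bipartite graph into a clique and compares a chain test with a threshold test. The function names and the input format are assumptions of this example, and the threshold test is the standard pairwise nested-neighborhood criterion rather than anything specific to the paper.

```python
from itertools import combinations


def is_chain(A, nbrs):
    """Neighbourhoods of A are nested iff they are nested in order of increasing size."""
    order = sorted(A, key=lambda a: len(nbrs[a]))
    return all(nbrs[x] <= nbrs[y] for x, y in zip(order, order[1:]))


def with_clique_on(A, B, nbrs):
    """Adjacency of the graph obtained from bipartite (A, B) by making A a clique."""
    adj = {v: set() for v in list(A) + list(B)}
    for a in A:
        for b in nbrs[a]:
            adj[a].add(b)
            adj[b].add(a)
    for u, v in combinations(A, 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_threshold(adj):
    """Pairwise test: N(u) is contained in N[v], or N(v) is contained in N[u]."""
    return all(adj[u] <= adj[v] | {v} or adj[v] <= adj[u] | {u}
               for u, v in combinations(adj, 2))


A, B = ["a1", "a2"], ["x", "y"]
for nbrs in ({"a1": {"x"}, "a2": {"x", "y"}}, {"a1": {"x"}, "a2": {"y"}}):
    print(is_chain(A, nbrs), is_threshold(with_clique_on(A, B, nbrs)))
```

In the second example the bipartite graph is a $2K_2$, and making $A$ a clique produces the path x, a1, a2, y, which is a $P_4$; both tests therefore fail together.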

It immediately follows from the above exposition that the following problem is NP-complete. This 
problem has also been referred to as Chain Editing in the literature (for instance in the work by 
Guo [17]). 

Bipartite Chain Editing 

Input: A bipartite graph $G = (A, B, E)$ and an integer $k$

Question: Does there exist a set $F \subseteq A \times B$ of size at most $k$ such that $G \triangle F$ is a chain graph?

Observe that in this problem we are given a bipartite graph together with a bipartition, and we are
asked to respect the bipartition in the editing set.

Corollary 3.9. The problem Bipartite Chain Editing is NP-complete.

Proof. We reduce from Split Threshold Editing. Recall that in this problem, we are given a split
graph $G = (V, E)$ with split partition $(C, I)$, and an integer $k$, and asked whether there is an editing set
$F \subseteq C \times I$ of size at most $k$ such that $G \triangle F$ is a threshold graph. Since a chain graph is a threshold
graph with the edges in the clique partition removed (Proposition 2.7), it follows that $G \triangle F$ with all the
edges in the clique partition removed is a chain graph.

Let $(G, k)$ be the input to Split Threshold Editing and let $(C, I)$ be the split partition. Remove
all the edges in $G[C]$ to obtain a bipartite graph $G' = (A, B, E')$. Now it follows directly from Proposition 2.7
that $(G, k)$ is a yes instance to Split Threshold Editing if and only if $(G', k)$ is a yes instance to
Bipartite Chain Editing. □


Chain Editing 

Input: A graph G = (V, E) and a non-negative integer k 

Question: Is there a set F of size at most k such that G △ F is a chain graph?

We now aim to prove Theorem 2, that Chain Editing is NP-complete. 

Proof of Theorem 2. We reduce from Bipartite Chain Editing. Let G = (A, B, E) be a bipartite
graph and consider the input instance (G, k) to Bipartite Chain Editing. We now show that
adding 2(k + 1) new vertices to G to obtain a graph G' = (V', E') gives us that (G', k) is a yes instance for
Chain Editing if and only if (G, k) is a yes instance for Bipartite Chain Editing.

Let G = (A, B, E) be a bipartite graph and k a positive integer. Add k + 1 new vertices a_1, ..., a_{k+1}
to A and make them universal to B, and add k + 1 new vertices b_1, ..., b_{k+1} to B and make them universal
to A. Call the resulting graph G' = (V', E').

The following claim follows immediately from the construction.

Claim 3.10. If G' △ F is a chain graph with |F| ≤ k, then G' △ F has bipartition (A ∪ {a_1, ..., a_{k+1}},
B ∪ {b_1, ..., b_{k+1}}).

It follows that for any input instance (G, k) to Bipartite Chain Editing, the instance (G', k)
as constructed above is a yes instance for Chain Editing if and only if (G, k) is a yes instance for
Bipartite Chain Editing. □
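The padding step of this reduction is easy to state as code. The sketch below is illustrative only: the function name and the vertex labels ("a", i) and ("b", i) are hypothetical, and it assumes that each new vertex is universal to the enlarged opposite side, so the new a-vertices are also adjacent to the new b-vertices.

```python
def pad_bipartite_instance(A, B, edges, k):
    """Build G' from a Bipartite Chain Editing instance (G, k).

    Assumption: the tuples ("a", i) and ("b", i) do not already occur as
    vertex names in A or B.
    """
    new_a = [("a", i) for i in range(1, k + 2)]   # a_1, ..., a_{k+1}
    new_b = [("b", i) for i in range(1, k + 2)]   # b_1, ..., b_{k+1}
    A2, B2 = list(A) + new_a, list(B) + new_b
    E2 = set(edges)
    E2 |= {(a, b) for a in new_a for b in B2}     # each a_i is universal to B'
    E2 |= {(a, b) for a in A2 for b in new_b}     # each b_i is universal to A'
    return A2, B2, E2


A2, B2, E2 = pad_bipartite_instance(["x", "y"], ["p"], [("x", "p")], k=2)
print(len(A2), len(B2), len(E2))
```

The budget k is passed through unchanged; only the graph grows, by 2(k + 1) vertices.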

Corollary 3.11. Assuming ETH, there is no algorithm solving either Chain Editing or Bipartite
Chain Editing in time ■ poly(n).

Proof. In both these cases we reduced from Split Threshold Editing without changing the parameter k.
Hence this follows immediately from the above exposition and from Corollary 3.8. □

3.2.2 Chordal Graphs


We will now combine our previous result on Chain Editing with the following observation of Yannakakis
to prove Theorem 5. Yannakakis showed [31], while proving the NP-completeness of Chordal
Completion (more often known as Minimum Fill-In [14]), that a bipartite graph can be transformed
into a chain graph by adding at most k edges if and only if the cobipartite graph formed by completing
the two sides can be transformed into a chordal graph by adding at most k edges.

Theorem 5. Chordal Editing is NP-hard.

To prove the theorem, we will first give an intermediate problem that makes the proof simpler. Let
G = (A, B, E) be a cobipartite graph. Define the problem Cobipartite Chordal Editing to be the
problem which on input (G, k) asks if we can edit at most k edges between A and B, i.e., does there exist
an editing set F ⊆ A × B of size at most k such that G △ F is a chordal graph. That is, Cobipartite
Chordal Editing asks for the bipartition A, B to be respected.


Cobipartite Chordal Editing

Input: A cobipartite graph G = (A, B, E) and an integer k

Question: Does there exist a set F ⊆ A × B of size at most k such that G △ F is a chordal graph?

We will use the following observation to prove the above theorem:

Lemma 3.12. If G = (A, B, E) is a bipartite graph, and G' = (A, B, E') is the cobipartite graph
constructed from G by completing A and B, then F is an optimal edge editing set for Bipartite Chain
Editing on input (G, k) if and only if F is an optimal edge editing set for Cobipartite Chordal
Editing on input (G', k).

Proof. Let F be an optimal editing set for Bipartite Chain Editing on input (G, k) and suppose that
G' △ F has an induced cycle of length at least four. Since G' △ F is cobipartite, this cycle has length exactly
four. Let a_1 b_1 b_2 a_2 a_1 be this cycle. But then it is clear that a_1b_1, a_2b_2 forms an induced 2K_2 in G △ F,
contradicting the assumption that F was an editing set.

For the reverse direction, suppose F is an optimal edge editing set for Cobipartite Chordal
Editing on input (G', k) only editing edges between A and B. Suppose for the sake of a contradiction
that G △ F was not a chain graph. Since F only goes between A and B, G △ F is bipartite and hence
by the assumption must have an induced 2K_2. This obstruction must be of the form a_1b_1, a_2b_2, but
then a_1 b_1 b_2 a_2 a_1 is an induced C_4 in G' △ F, which is a contradiction to the assumption that G' △ F was
chordal. Hence G △ F is a chain graph. □
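The correspondence used in this proof (an induced 2K_2 across the bipartition of G corresponds to an induced C_4 in the cobipartite completion G') can be verified exhaustively on small instances. The following sketch is an illustration and not part of the paper: the helper names are hypothetical, and chordality is tested by greedily removing simplicial vertices, which is one standard way to check for a perfect elimination ordering.

```python
from itertools import combinations, product


def has_induced_2k2(A, B, E):
    """True if there are edges a1b1, a2b2 with a1b2 and a2b1 both missing."""
    for a1, a2 in combinations(A, 2):
        for b1, b2 in product(B, repeat=2):
            if (b1 != b2 and (a1, b1) in E and (a2, b2) in E
                    and (a1, b2) not in E and (a2, b1) not in E):
                return True
    return False


def cobipartite_completion(A, B, E):
    """Adjacency sets of G': the edges E plus all pairs inside A and inside B."""
    adj = {v: set() for v in list(A) + list(B)}
    pairs = list(E) + list(combinations(A, 2)) + list(combinations(B, 2))
    for u, v in pairs:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_chordal(adj):
    """Chordality test: repeatedly delete a vertex whose neighborhood is a clique."""
    adj = {v: set(ns) for v, ns in adj.items()}
    while adj:
        simplicial = next(
            (v for v in adj
             if all(y in adj[x] for x, y in combinations(adj[v], 2))),
            None)
        if simplicial is None:
            return False
        for u in adj.pop(simplicial):
            adj[u].discard(simplicial)
    return True


# Compare both properties on every bipartite graph with sides of size 3.
A, B = ["a0", "a1", "a2"], ["b0", "b1", "b2"]
all_pairs = list(product(A, B))
agree = True
for mask in range(2 ** len(all_pairs)):
    E = {p for i, p in enumerate(all_pairs) if mask >> i & 1}
    chain = not has_induced_2k2(A, B, E)
    chordal = is_chordal(cobipartite_completion(A, B, E))
    agree = agree and (chain == chordal)
print(agree)
```

The check covers all 512 bipartite graphs on these six vertices; it is a sanity test of the statement, not a substitute for the proof.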

Corollary 3.13. Cobipartite Chordal Editing is NP-complete. 

We are now ready to prove Theorem 5. 

Proof of Theorem 5. Let (G = (A, B, E), k) be an input to Cobipartite Chordal
Editing, where G is a cobipartite graph. Our reduction is as follows. Create G' = (A' ∪ B', E') as follows:

• A' = A ∪ {a_1, a_2, ..., a_{k+1}},

• B' = B ∪ {b_1, b_2, ..., b_{k+1}},

• E' = E ∪ ⋃_{i ≤ k+1} {a_i b_i} ∪ ⋃_{i < j ≤ k+1} {a_i a_j, b_i b_j}.

Finally, we create G'' as follows. For every edge a_i a_j create k + 1 new vertices adjacent to only a_i and
a_j. Do the same thing for every edge b_i b_j. This forces none of the edges in A' to be removed and none
of the edges in B' to be removed.

Claim 3.14. The instance (G'', k) of Chordal Editing is equivalent to the instance (G, k) of Cobipartite
Chordal Editing.

Proof of claim. The proof of the above claim is straightforward. If we delete an edge within A (resp. B),
we create at least k + 1 cycles of length 4, each of which needs at least one edit to destroy; hence in any
yes instance, we do not edit edges within A (resp. B). Furthermore, any chordal graph remains chordal
when adding a simplicial vertex, which is exactly what the k + 1 new vertices are. □


From the claim it follows that (G'', k) is a yes instance to Chordal Editing if and only if (G, k) is a
yes instance to Cobipartite Chordal Editing. The theorem follows immediately from Corollary 3.13. □

Corollary 3.15. Assuming ETH, there is no algorithm solving Chordal Editing in time ■ poly(n).

4 Kernels for Modifications into Threshold and Chain Graphs 

First we give kernels with quadratically many vertices for the following three problems: Threshold
Completion, Threshold Deletion, and Threshold Editing, answering a recent question of Liu,
Wang and Guo [20]. Then we continue by providing kernels with quadratically many vertices for Chain
Completion, Chain Deletion, and Chain Editing. Our kernelization algorithms use techniques
similar to the previous result that Trivially Perfect Editing admits a polynomial kernel [11]. Observe
that the class of threshold graphs is closed under taking complements. It follows that for every
instance (G, k) of Threshold Completion, (Ḡ, k) is an equivalent instance of Threshold Deletion
(and vice versa), where Ḡ denotes the complement of G. Almost the same trick applies to Chain Deletion.
Due to this, we restrict our attention to the completion and editing variants for the remainder of the
section. Motivated by the characterization of threshold graphs in Propositions 2.2 and 2.7, we define
obstructions (also see Figure 1).

Definition 4.1 (ℋ, Obstruction). A graph H is a threshold obstruction if it is isomorphic to a member
of the set {C_4, P_4, 2K_2} and a chain obstruction if it is isomorphic to a member of the set {C_3, 2K_2, C_5}.
If it is clear from the context, we will often use the term obstruction for both threshold and chain
obstructions and denote the set of obstructions by ℋ. Furthermore, if an obstruction H is an induced
subgraph of a graph G we call H an obstruction in G.
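Since all threshold obstructions have exactly four vertices, they can be found by brute force over vertex quadruples. The following is a minimal sketch under our own naming (nothing here is from the paper); it relies on the fact that among graphs on four vertices, C_4, P_4 and 2K_2 are determined by their edge count together with their degree sequence. It handles threshold obstructions only, not the chain obstructions C_3 and C_5.

```python
from itertools import combinations

# (number of edges, sorted degree sequence) of each threshold obstruction
THRESHOLD_OBSTRUCTIONS = {
    (2, (1, 1, 1, 1)): "2K2",
    (3, (1, 1, 2, 2)): "P4",
    (4, (2, 2, 2, 2)): "C4",
}


def find_threshold_obstruction(vertices, edges):
    """Return (name, quadruple) for some induced C4, P4 or 2K2, or None."""
    edge_set = {frozenset(e) for e in edges}
    for quad in combinations(vertices, 4):
        deg = dict.fromkeys(quad, 0)
        m = 0
        for u, v in combinations(quad, 2):
            if frozenset((u, v)) in edge_set:
                deg[u] += 1
                deg[v] += 1
                m += 1
        name = THRESHOLD_OBSTRUCTIONS.get((m, tuple(sorted(deg.values()))))
        if name:
            return name, quad
    return None


print(find_threshold_obstruction([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]))
print(find_threshold_obstruction([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)]))
```

The first call finds the path itself as a P_4; the second graph is a star, which is a threshold graph, so no obstruction is reported.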

Definition 4.2 (Realizing). For a graph G and a set of vertices X ⊆ V(G) we say that a vertex
v ∈ V(G) \ X is realizing Y ⊆ X if N_X(v) = Y. Furthermore, we say that a set Y ⊆ X is being realized
if there is a vertex v ∈ V(G) \ X such that v is realizing Y.

Before proceeding, we observe that our kernelization algorithms do not modify any edges, and only
change the budget in the case that we discover that we have a no-instance (in which case we return (H, 0),
where H is an obstruction in G). The only modification of the instance is to delete vertices, hence the
kernelized instance is an induced subgraph of the original graph. Since the parameter is never increased,
we obtain proper kernels.

4.1 Modifications into Threshold Graphs 

We now focus on modifications to threshold graphs and obtaining kernels for these operations. 

4.1.1 Outline of the Kernelization Algorithm 

The kernelization algorithm consists of a twin reduction rule and an irrelevant vertex rule. The twin 
reduction rule is based on the observation that any obstruction containing vertices from a large enough 
twin class will have to be handled by edges not incident to the twin class. From this observation, we may 
conclude that for any twin class, we may keep only a certain amount without affecting the solutions. 

A key concept of the irrelevant vertex rule is what will be referred to as a threshold-modulator. A
threshold-modulator is a set of vertices X in G of linear size in k, such that for every obstruction H in G
one can add and remove edges in [X]^2 to turn H into a non-obstruction. First, we prove that we can
in polynomial time either obtain such a set X or conclude correctly that the instance is a no-instance.
The observation that G − X is a threshold graph will be exploited heavily and we now fix a threshold
decomposition (C, I) of G − X. We then prove that the idea of Proposition 2.1 can be extended to
vertices in G − X when considering their neighborhoods in G. In other words, the neighborhoods of
the vertices in G − X are nested also when considering G. This immediately yields that the number of
subsets of X that are being realized is bounded linearly in the size of X and hence also in k.

We now either conclude that the graph is small or we identify a sequence of levels in the threshold
decomposition containing many vertices, such that all the clique vertices and all the independent set
vertices in the sequence have identical neighborhoods in X, respectively. The crux is that in the middle
of such a sequence there will be a vertex that is replaceable by other vertices in every obstruction and
hence is irrelevant. Such a sequence is obtained by discarding all levels in the decomposition that are
extremal with respect to a subset Y of X, meaning that there either are no levels above or underneath
that contain vertices realizing Y. One can prove that in this process, only a quadratic number of vertices
are discarded and from this we obtain a kernel.

4.1.2 The Twin Reduction Rule 

First, we introduce the twin reduction rule as described above. For the remainder of the section we will 
assume this rule to be applied exhaustively and hence we can assume all twin classes to be small. 

Rule 1 (Twin reduction rule). Let (G, k) be an instance of Threshold Completion or Threshold
Editing and v a vertex in G such that |tc(v)| > 2k + 2. We then reduce the instance to (G − v, k).

Lemma 4.3. Let G be a graph and v a vertex in G such that |tc(v)| > 2k + 2. Then for every k we
have that (G, k) is a yes-instance of Threshold Completion (or Threshold Editing) if and only
if (G − v, k) is a yes-instance of Threshold Completion (resp. Threshold Editing).

Proof. For readability we only consider Threshold Completion; however, the exact same proof works
for Threshold Editing. Let G' = G − v. It trivially holds that if (G, k) is a yes-instance, then
(G', k) is also a yes-instance. This is due to the fact that removing a vertex never creates new
obstructions.

Now, let (G', k) be a yes-instance and assume for a contradiction that (G, k) is a no-instance. Let F
be an optimal solution of (G', k) and W an obstruction in G △ F. Since W is not an obstruction in G' △ F
it follows immediately that v is in W. Furthermore, since |F| ≤ k it follows that there are two vertices
a, b ∈ tc(v) \ {v} that F is not incident to. Also, one can observe that no obstruction contains more than
two vertices from a twin class and hence we can assume without loss of generality that b is not in W. It
follows that

N_{G△F}(v) ∩ (W − v) = N_G(v) ∩ (W − v) = N_G(b) ∩ (W − v) = N_{G△F}(b) ∩ (W − v)

and hence the graph induced on V(W) △ {b, v} is an obstruction in G' △ F, contradicting that F is a solution. □
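Rule 1 can be applied exhaustively with a simple loop. The sketch below is illustrative and rests on an assumption about notation that is defined outside this section: we take tc(v) to consist of v together with all vertices that have the same open neighborhood or the same closed neighborhood as v. The function names are hypothetical.

```python
def twin_class(adj, v):
    """Vertices with the same open or the same closed neighborhood as v.

    Assumption: this is what tc(v) denotes; adj maps each vertex to the
    set of its neighbors.
    """
    open_v = adj[v]
    closed_v = adj[v] | {v}
    return {u for u in adj
            if adj[u] == open_v or adj[u] | {u} == closed_v}


def apply_twin_rule(adj, k):
    """Delete vertices until every twin class has at most 2k + 2 members."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(twin_class(adj, v)) > 2 * k + 2:
                for u in adj.pop(v):              # remove v from the graph
                    adj[u].discard(v)
                changed = True
                break
    return adj


# A star with 10 leaves: the leaves form one twin class of size 10.
star = {0: set(range(1, 11))}
star.update({i: {0} for i in range(1, 11)})
reduced = apply_twin_rule(star, k=1)
print(len(reduced))
```

With k = 1 the leaf class is trimmed to 2k + 2 = 4 vertices, leaving the center and four leaves. The budget k is never changed by the rule.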

4.1.3 The Modulator 

To obtain an O(k^2) kernel we aim at an irrelevant vertex rule. However, this requires some tools. The
first one is the concept of a threshold-modulator, as defined below.

Definition 4.4 (Threshold modulator). Let G be a graph and X ⊆ V(G) a set of vertices. We say
that X is a threshold-modulator of G if for every obstruction W in G it holds that there is a set of
edges F in [X]^2 such that W △ F is not an obstruction.

Less formally, a set X is a threshold-modulator of a graph G if for every obstruction W in G you
can edit edges between vertices in X to turn W into a non-obstruction. Our kernelization algorithm will
heavily depend on finding a small threshold-modulator X and the fact that G − X is a threshold graph.

Lemma 4.5. There is a polynomial time algorithm that given a graph G and an integer k either

• outputs a threshold-modulator X of G such that |X| ≤ 4k or

• correctly concludes that (G, k) is a no-instance of both Threshold Completion and Threshold
Editing.

Proof. Let X_1 be the empty set and 𝒲 = {W_1, ..., W_t} the set of all obstructions in G. We execute
the following procedure for every W_i in 𝒲: If W_i △ F is an obstruction for every F ⊆ [X_i ∩ V(W_i)]^2 we
let X_{i+1} = X_i ∪ V(W_i), otherwise we let X_{i+1} = X_i. After we have considered all obstructions we let
X = X_{t+1}. If |X| > 4k we conclude that (G, k) is a no-instance, otherwise we output X.

Since all obstructions are finite the algorithm described clearly runs in polynomial time. We now
argue that X is a threshold-modulator of G. If W_i was added to X_{i+1}, we let F be all the non-edges
of W_i. Since W_i △ F is isomorphic to K_4 it follows immediately that W_i △ F is not an obstruction. If W_i
was not added to X_{i+1}, let F be the set found in [X_i ∩ V(W_i)]^2 such that W_i △ F is not an obstruction.
Observe that F ⊆ [X]^2 and hence X is a threshold-modulator.

Figure 7 (panels H_1, ..., H_7): Some of the intersections of an obstruction with a threshold-modulator X
that will not occur by definition. More specifically the ones necessary for the proof of the kernel.

It remains to prove that if |X| > 4k then (G, k) is a no-instance of Threshold Editing. Observe
that it will follow immediately that (G, k) is a no-instance of Threshold Completion. Since every
obstruction consists of four vertices there were at least k + 1 obstructions added during the procedure.
Assume without loss of generality that W_1, ..., W_{k+1} were added. Observe that by construction, a solution
must contain an edge in [X_{i+1} − X_i]^2 for every i ∈ [k + 1] and hence contains at least k + 1 edges. □
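The greedy procedure from the proof of Lemma 4.5 can be sketched directly. This is an illustration under our own assumptions rather than the authors' implementation: obstructions are enumerated as vertex quadruples in lexicographic order (the proof allows any order), and the function names are hypothetical.

```python
from itertools import combinations

OBSTRUCTION_SIGNATURES = {
    (2, (1, 1, 1, 1)),   # 2K_2
    (3, (1, 1, 2, 2)),   # P_4
    (4, (2, 2, 2, 2)),   # C_4
}


def is_obstruction(quad, edges):
    """True if the subgraph induced on the four vertices is C4, P4 or 2K2."""
    deg = dict.fromkeys(quad, 0)
    m = 0
    for u, v in combinations(quad, 2):
        if frozenset((u, v)) in edges:
            deg[u] += 1
            deg[v] += 1
            m += 1
    return (m, tuple(sorted(deg.values()))) in OBSTRUCTION_SIGNATURES


def fixable_inside(quad, edges, X):
    """True if editing only pairs inside X turns quad into a non-obstruction."""
    pairs = [frozenset(p)
             for p in combinations([v for v in quad if v in X], 2)]
    return any(not is_obstruction(quad, edges ^ set(F))
               for r in range(len(pairs) + 1)
               for F in combinations(pairs, r))


def threshold_modulator(vertices, edge_list, k):
    """Return a threshold-modulator of size at most 4k, or None if the
    greedy set grows beyond 4k (in which case (G, k) is a no-instance)."""
    edges = {frozenset(e) for e in edge_list}
    X = set()
    for quad in combinations(vertices, 4):
        if is_obstruction(quad, edges) and not fixable_inside(quad, edges, X):
            X |= set(quad)
    return X if len(X) <= 4 * k else None


# Two vertex-disjoint copies of P_4 need two edits.
V = list(range(1, 9))
E = [(1, 2), (2, 3), (3, 4), (5, 6), (6, 7), (7, 8)]
print(threshold_modulator(V, E, k=1))
print(threshold_modulator(V, E, k=2))
```

For k = 1 the procedure collects eight vertices, exceeds the bound 4k and reports a no-instance; for k = 2 it returns all eight vertices as the modulator. Enumerating all quadruples takes O(n^4) time, which is polynomial but not tuned for speed.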


4.1.4 Obtaining Structure 

We now exploit the threshold-modulator and its interaction with the remaining graph to obtain structure.
First, we prove that the neighborhoods of the vertices outside of X are nested and that the number of
realized sets in X is bounded linearly in k.

Lemma 4.6. Let G be a graph and X a threshold-modulator. For every pair of vertices u and v in G − X
it holds that either N(u) ⊆ N[v] or N(v) ⊆ N[u].

Proof. Assume otherwise for a contradiction and let u' be a vertex in N(u) \ N[v] and v' a vertex in
N(v) \ N[u]. Let W = G[{u, v, u', v'}] and observe that uu' and vv' are edges in W and uv' and vu' are
non-edges in W by definition. Hence, no matter if some of the edges uv and u'v' are present or not, W
is an obstruction in G (see Figure 7 for an illustration). Since u'v' is the only pair in W possibly with
both elements in X this contradicts X being a threshold-modulator. □

Lemma 4.7. Let G be a graph and X a corresponding threshold-modulator, then

|{N_X(v) for v ∈ V(G) \ X}| ≤ |X| + 1.

Or in other words, there are at most |X| + 1 subsets of X that are being realized.

Proof. Let u and v be two vertices in G − X. It follows directly from Lemma 4.6 that either N_X(v) ⊆
N_X(u) or N_X(v) ⊇ N_X(u). The realized sets therefore form a chain under inclusion, and a chain of
subsets of X has at most |X| + 1 members. The result follows immediately. □
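The statement of Lemma 4.7 is easy to inspect on a concrete graph. The following sketch uses a toy example of our own (the graph, the set X and the function names are all hypothetical): it collects the realized subsets of X and checks that they are nested.

```python
def realized_sets(adj, X):
    """All sets N_X(v) for vertices v outside X."""
    X = set(X)
    return {frozenset(adj[v] & X) for v in adj if v not in X}


def is_nested(family):
    """True if the sets in the family form a chain under inclusion."""
    sets = sorted(family, key=len)
    return all(s <= t for s, t in zip(sets, sets[1:]))


# Toy graph: p - x1 - q - x2 is a P_4 and r is isolated. Adding the edge
# x1x2 destroys the only obstruction, so X = {x1, x2} is a
# threshold-modulator and G - X is edgeless.
adj = {
    "x1": {"p", "q"},
    "x2": {"q"},
    "p": {"x1"},
    "q": {"x1", "x2"},
    "r": set(),
}
X = {"x1", "x2"}
family = realized_sets(adj, X)
print(len(family), is_nested(family))
```

Here three subsets are realized, namely the empty set, {x1} and {x1, x2}, which meets the bound |X| + 1 = 3 exactly.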

With the definition of the modulator and the basic properties above, we are now ready to extract 
more vertices from the instance, aiming at many consecutive levels that have the same neighborhood 
in X for the clique, and independent set vertices, respectively. This will lead up to our irrelevant vertex 
rule. 

Let G be a graph, X a threshold-modulator and (C, I) a threshold partition of G − X. Letting P
denote either C or I, we say that a subset Y ⊆ X has its upper extreme in P_i if P_i realizes Y and for
every j > i it holds that P_j does not realize Y. Similarly, a subset Y ⊆ X has its lower extreme in P_i
if P_i realizes Y and for every j < i it holds that P_j does not realize Y. We say that Y ⊆ X is extremal
in P_i if Y has its upper or lower extreme in P_i. Observe that every Y ⊆ X is extremal in at most two
clique fragments and two independent set fragments.

We continue having P denote either C or I.

Lemma 4.8. Let G be a graph, X a threshold-modulator and (C, I) a threshold partition of G − X. For
every Y ⊆ X it holds that if Y has its lower extreme in P_ℓ and upper extreme in P_u, then for every
vertex v ∈ P_i with i ∈ [ℓ + 1, u − 1] it holds that N_X(v) = Y.

Proof. Let Y be a subset of X with C_ℓ and C_u being its lower and upper extremes in the clique respectively.
By definition there is a vertex u ∈ C_ℓ and a vertex w ∈ C_u such that N_X(u) = N_X(w) = Y. Let i be
an integer in [ℓ + 1, u − 1] and v a vertex in C_i. By the definition of a threshold partition it holds that
N_{G−X}(w) ⊆ N_{G−X}(v) ⊆ N_{G−X}(u). It follows from Lemma 4.6 that N(w) ⊆ N[v] and that N(v) ⊆ N[u].
Hence,

Y = N_X(w) ⊆ N_X(v) ⊆ N_X(u) = Y

and we conclude that N_X(v) = Y. Since i and v were arbitrary, the proof is complete. □

Definition 4.9 (Important, Outlying, and Regular). We say that P_i in the partition is important if
there is a Y ⊆ X such that Y has its extreme in P_i. Furthermore, a level L_i is important if C_i or I_i is
important. Let f be the smallest number such that |⋃_{i ≤ f} C_i| ≥ 2k + 2 and r the largest number such
that |⋃_{i ≥ r} I_i| ≥ 2k + 2. A level L_i is outlying if i ≤ f or i ≥ r. All other levels of the decomposition are
regular and a vertex is regular, outlying or important depending on the type of the level it is contained
in.

Lemma 4.10. Let G be a graph and X a threshold-modulator of G of size at most 4k. Then every
threshold partition of G − X has at most 16k + 4 important levels.

Proof. By Lemma 4.7 at most |X| + 1 ≤ 4k + 1 subsets of X are realized, and each of them is extremal in
at most two clique fragments and two independent set fragments. Hence there are at most
4(4k + 1) = 16k + 4 important levels. □

Lemma 4.11. Let G be a graph, X a threshold-modulator of G and (C, I) a threshold partition of G − X,
then for every set Y ⊆ X there are at most two important clique fragments (independent fragments)
realizing Y.

Proof. We first prove the statement for clique fragments. Let Y be a subset of X and i < j < k three
integers. Assume for a contradiction that C_i, C_j and C_k are important clique fragments all realizing Y.
By definition there are vertices u ∈ C_i, v ∈ C_j and w ∈ C_k such that N_X(u) = N_X(v) = N_X(w) = Y.
Furthermore, there is a vertex v' ∈ C_j such that N_X(v') ≠ Y since C_j is important and Y does not have an
extreme in C_j. By the definition of threshold partitions, we have that N_{G−X}(w) ⊆ N_{G−X}(v') ⊆ N_{G−X}(u).
Lemma 4.6 immediately implies that N(w) ⊆ N[v'] and N(v') ⊆ N[u] and since {u, v', w} ⊆ ⋃_i C_i it holds
that N[w] ⊆ N[v'] ⊆ N[u]. Hence N_X(w) ⊆ N_X(v') ⊆ N_X(u); since N_X(v') ≠ Y, at least one of these
inclusions is proper, which contradicts the definition of w and u since N_X(u) = N_X(w). By a symmetric
argument, the statement also holds for independent fragments. □

Lemma 4.12. Let G be a graph, X a threshold-modulator of G of size at most 4k and (C, I) a threshold
partition of G − X. Then there are at most 64k^2 + 80k + 16 important vertices in G − X.

Proof. Let Y be the set of all vertices contained in an important clique or independent fragment and let Z
be the set of all important vertices. Observe that Y ⊆ Z and that every C_i or I_i contained in Z \ Y is a
twin class in G by definition. By Lemma 4.10 there are at most 16k + 4 important levels and since the
twin-rule has been applied exhaustively it holds that |Z \ Y| ≤ (16k + 4)(2k + 2) = 32k^2 + 40k + 8.

Let A be a subset of X and B the vertices in Y such that their neighborhood in X is exactly A. Let D
be a C_i or I_i contained in Y and observe that D ∩ B is a twin class in G and hence |D ∩ B| ≤ 2k + 2.
Hence it follows from Lemma 4.11 that |B| ≤ 8k + 8. Furthermore, we know from Lemma 4.7 that
there are at most 4k + 1 sets realized in X and hence |Y| ≤ (8k + 8)(4k + 1) = 32k^2 + 40k + 8. It follows
immediately that |Z| ≤ 64k^2 + 80k + 16, completing the proof. □

Lemma 4.13. Let G be a graph, X a threshold-modulator of G of size at most 4k and (C, I) a threshold
partition of G − X. Then there are at most 80k^2 + 112k + 32 important and outlying vertices in total in
G − X.

Proof. By Lemma 4.12 it follows that there are at most 64k^2 + 80k + 16 vertices that are important and
possibly outlying. It follows from Lemma 4.8 that if a level is not important its vertices are covered by
at most two twin classes in G and hence the level contains at most 4k + 4 vertices. By definition there
are at most 4k + 4 outlying levels and hence at most (4k + 4)(4k + 4) = 16k^2 + 32k + 16 vertices which
are outlying, but not important. The result follows immediately. □

Lemma 4.14. Let G be a graph, X a threshold-modulator of G, v a regular vertex in some threshold
partition (𝒞, ℐ) of G − X, C = ⋃𝒞 and I = ⋃ℐ. Then for every F ⊆ [V(G)]^2 such that G △ F is a
threshold graph, |F| ≤ k and every split partition (C_F, I_F) of G △ F we have:

• v ∈ C if and only if v ∈ C_F and

• v ∈ I if and only if v ∈ I_F.

Proof. Observe that the two statements are equivalent and that it is sufficient to prove the forward
direction of both statements. First, we prove that v ∈ C implies that v ∈ C_F. Let Y be the set of
outlying vertices in I ∩ N_G(v) and recall that |Y| > 2k + 1 by definition. Observe that at most 2k
vertices in Y are incident to F and hence there are two vertices u, u' in Y that are untouched by F.
Clearly, u and u' are not adjacent in G △ F and hence we can assume without loss of generality that u
is in I_F. Since u is untouched by F, v is adjacent to u by the definition of outlying vertices and hence v
is not in I_F. A symmetric argument gives that v ∈ I implies that v ∈ I_F and hence our argument is
complete. □

4.1.5 The Irrelevant Vertex Rule 

We have now obtained the structure necessary to give our irrelevant vertex rule. But before stating the 
rule, we need to define these consecutive levels with similar neighborhood and what it means for a vertex 
to be in the middle of such a collection of levels. 

Definition 4.15 (Large strips, central vertices). Let G be a graph, X a threshold-modulator and (C, I)
a threshold partition of G − X. A strip is a maximal set of consecutive levels which are all regular and
we say that a strip is large if it contains at least 16k + 13 vertices. For a strip S = ([C_a, I_a], ..., [C_b, I_b])
a vertex v ∈ C_i is central if a ≤ i ≤ b and |⋃_{j ∈ [a, i−1]} C_j| ≥ 2k + 2 and |⋃_{j ∈ [i+1, b]} C_j| ≥ 2k + 2. Similarly
we say that a vertex v ∈ I_i is central if a ≤ i ≤ b and |⋃_{j ∈ [a, i−1]} I_j| ≥ 2k + 2 and |⋃_{j ∈ [i+1, b]} I_j| ≥ 2k + 2.

Furthermore, we say that a vertex v is central in G if there exists a threshold-modulator X of size at
most 4k and a threshold decomposition of G − X such that v is central in a large strip.

Lemma 4.16. If a strip is large it has a central vertex.

Proof. Let S = ([C_a, I_a], ..., [C_b, I_b]) be a large strip. First, we consider the case when
|⋃_{i ∈ [a, b]} C_i| ≥ |⋃_{i ∈ [a, b]} I_i|. Observe that |⋃_{i ∈ [a, b]} C_i| ≥ 8k + 7. Let i be the smallest number
such that |⋃_{j ∈ [a, i−1]} C_j| ≥ 2k + 2. It follows immediately from |C_{i−1}| ≤ 2k + 2 that
|⋃_{j ∈ [a, i−1]} C_j| ≤ 4k + 3. Furthermore, since |C_i| ≤ 2k + 2 it follows that
|⋃_{j ∈ [i+1, b]} C_j| ≥ 8k + 7 − (2k + 2 + 4k + 3) = 2k + 2. And hence any vertex in C_i is central. A symmetric
argument for the case |⋃_{i ∈ [a, b]} C_i| < |⋃_{i ∈ [a, b]} I_i| completes the proof. □
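The counting argument in this proof translates into a single scan over the strip. The following is a minimal sketch under an assumed representation: the clique fragment sizes of the strip are given as a list `sizes`, with index 0 standing for level a. Both the function name and this encoding are hypothetical.

```python
def central_level(sizes, k):
    """Smallest index i with at least 2k + 2 clique vertices strictly
    before level i and at least 2k + 2 strictly after it, or None."""
    need = 2 * k + 2
    total = sum(sizes)
    before = 0
    for i, size in enumerate(sizes):
        after = total - before - size
        if before >= need and after >= need:
            return i
        before += size
    return None


# k = 1: fragments hold at most 2k + 2 = 4 vertices after the twin rule,
# and a large strip has at least 8k + 7 = 15 vertices on its larger side.
print(central_level([4, 4, 4, 3], k=1))
print(central_level([1] * 15, k=1))
print(central_level([1, 1, 1], k=1))
```

On inputs that satisfy the hypotheses of the lemma the scan always finds a level; the last call shows that a strip that is too small may have none.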

Rule 2 (Irrelevant vertex rule). If (G, k) is an instance of Threshold Completion or Threshold
Editing and v is a central vertex in G, reduce to (G − v, k).

Lemma 4.17. Let (G, k) be an instance, X a threshold-modulator and v a central vertex in G. Then
(G, k) is a yes-instance of Threshold Editing (Threshold Completion) if and only if (G − v, k)
is a yes-instance.

Proof. For readability we only consider Threshold Editing; however, the exact same proof works for
Threshold Completion. For the forward direction, for any vertex v, if (G, k) is a yes-instance, then
(G − v, k) is also a yes-instance. This holds since threshold graphs are hereditary.

For the reverse direction, let (G − v, k) be a yes-instance and assume for a contradiction that (G, k)
is a no-instance. Let F be a solution of (G − v, k) satisfying Lemma 2.5, and let G' = G △ F. By
assumption, (G, k) is a no-instance, so specifically, G' is not a threshold graph. Let W be an obstruction
in G'. Clearly v ∈ W since otherwise there is an obstruction in (G − v) △ F, so consider Z = V(W) − v.
For convenience we will use N' to denote neighborhoods in G' and specifically for any set Y ⊆ V(G'),
N'_Y(v) = N_{G'}(v) ∩ Y. Furthermore, let (C, I) be a threshold decomposition of G − X such that there is a
large strip S for which v is central. We will now consider the case when v is in the clique of G − X. Since
|F| ≤ k and S is a large strip it follows immediately that there are two clique vertices w and w' in S
in higher levels than v that are not incident to F. Observe that {w, w', v} forms a triangle and that W
contains no such subgraph. Hence, we can assume without loss of generality that w ∉ V(W). Similarly,
we obtain a clique vertex u in a lower level than v in S such that u ∉ W.

Figure 8: The vertex v was a central vertex in a strip and W = {v, a, b, y} was assumed to be an
obstruction.

Observe that G'[Z ∪ {u}] is not an obstruction and hence N_Z(u) = N'_Z(u) ≠ N'_Z(v) = N_Z(v). Since u
and v are clique vertices from the same strip it is true that N_X(v) = N_X(u) and hence there is an
independent vertex a in Z such that lev(u) < lev(a) < lev(v) (see Definition 2.3). In other words u
is adjacent to a while v and w are not. By a symmetric argument we obtain a vertex b such that
lev(v) < lev(b) < lev(w), meaning that both u and v are adjacent to b while w is not. Let y be the last
vertex of Z, meaning that {v, y, a, b} = V(W). Observe that a and b are regular vertices and hence it
follows from Lemma 4.14 that for every threshold partition of G' it holds that a and b are independent
vertices.

Recall that u, v, w, a, b are all regular and hence they are in the same partitions in G' as in G − X
by Lemma 4.14. Furthermore, since W is an obstruction and a is neither adjacent to v nor b in G' it
holds that y and a are adjacent in G'. It follows that y is a clique vertex in G' and hence it is adjacent
to both u and w in G'. Since u and w are not incident to F by definition, they are adjacent to y also
in G. Since u, v, w are regular and from the same strip it follows that v is adjacent to y in both G and
G'. Observe that the only possible adjacency not yet decided in W is the one between b and y. However,
for W to be an obstruction it should not be present. Hence y is adjacent to a but not to b in G'. By
definition N_G(a) ⊆ N_G(b), however by the last observation this is not true in G'. This contradicts that F
satisfies Lemma 2.5. A symmetric argument gives a contradiction for the case when v is an independent
vertex and hence the proof is complete. □


The above lemma shows the soundness of the irrelevant vertex rule, Rule 2, and we may therefore
apply it exhaustively. The following theorem wraps up the goal of this section.

Theorem 6. The following three problems admit kernels with at most 336k^2 + 388k + 92 vertices: Threshold
Deletion, Threshold Completion and Threshold Editing.

Proof. Assume that Rules 1 and 2 have been applied exhaustively. If this process does not produce a
threshold-modulator, we can safely output a trivial no-instance by Lemma 4.5. Hence, we can assume
that we have a threshold-modulator X of size at most 4k and that the reduction rules cannot be applied.
By Lemma 4.13 we know that there are at most 80k^2 + 112k + 32 vertices in G − X that are not regular.
Furthermore, every regular vertex is contained in a strip and by Lemma 4.10 there are at most 16k + 5
such strips. Since the reduction rules cannot be applied, no strip is large, and hence they contain at
most 16k + 12 vertices each. Since every vertex in G is either in X, or considered regular, outlying or
important this gives us 4k + 80k^2 + 112k + 32 + (16k + 5)(16k + 12) = 336k^2 + 388k + 92 vertices in
total. □
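The arithmetic behind the bound can be re-derived mechanically from the individual counts in Lemmas 4.12 and 4.13 and in the proof above. The short check below is only a sanity test with a hypothetical function name; since both sides are polynomials of degree two in k, agreement on three or more values of k already implies that they are identical.

```python
def kernel_bound(k):
    """Sum of the vertex counts used in the proof of Theorem 6."""
    modulator = 4 * k                                   # |X| <= 4k
    important = ((16 * k + 4) * (2 * k + 2)             # |Z \ Y|, Lemma 4.12
                 + (8 * k + 8) * (4 * k + 1))           # |Y|, Lemma 4.12
    outlying = (4 * k + 4) * (4 * k + 4)                # Lemma 4.13
    regular = (16 * k + 5) * (16 * k + 12)              # strips that are not large
    return modulator + important + outlying + regular


print(all(kernel_bound(k) == 336 * k * k + 388 * k + 92 for k in range(100)))
```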

4.2 Adapting the Kernel to Modification to Chain Graphs

In this section we provide kernels with quadratically many vertices for Chain Deletion, Chain
Completion and Chain Editing. Due to the fundamental similarities between modification to chain and
threshold graphs we omit the full proof and instead highlight the differences between the two proofs.
Observe that the only proofs for the threshold kernels that explicitly apply the obstructions are those
of Lemmata 4.5, 4.6 and 4.17 and hence these will receive most of our attention.

The twin reduction rule goes through immediately and hence our first obstacle is the modulator.
Luckily, this is a minor one. Recall from Definition 4.1 that the obstructions now are ℋ = {2K_2, C_3, C_5}.
We thus get a chain-modulator X of size 5k, as the largest obstruction contains five vertices. Besides
this detail, the proof goes through exactly as it is.



Figure 9 (panels H_1, ..., H_5): Some of the intersections of an obstruction with a chain-modulator X that
by definition will not occur. Dashed edges represent edges that could or could not be there. These are the
intersections necessary for the proof of the kernel.


4.2.1 An Additional Step 

Before we continue with the remainder of the proof we need an additional step, namely to discard all
vertices that are isolated in G − X. We will prove that by doing this we discard at most O(k^2) vertices.
Now, if the irrelevant vertex rule concludes that the graph is small, then the graph is small also when we
reintroduce the discarded vertices. And if we find an irrelevant vertex, we remove it and reintroduce the
discarded vertices before we once again apply our reduction rules. Due to the locality of our arguments,
this is a valid approach.

Lemma 4.18. For a graph G and a corresponding chain-modulator X there are at most 10k^2 + 12k + 2
isolated vertices in G − X.

Proof. Let I be the set of isolated vertices in G − X. We will prove that 𝒜 = {N_X(v) | v ∈ I} is laminar
(see Definition 2.8) and hence by Lemma 2.9 it holds that |𝒜| ≤ |X| + 1 ≤ 5k + 1. It follows immediately,
due to the twin reduction rule, that there are at most (5k + 1)(2k + 2) = 10k^2 + 12k + 2 isolated
vertices in G − X.

Assume for a contradiction that there are vertices u, v and w in I such that there exists u' ∈ N_X(u) \
N_X(v) and v' ∈ N_X(v) \ N_X(u) with {u', v'} ⊆ N_X(w). These vertices intersect with the modulator as
a variant of the forbidden H^ in Figure 9 and hence we get a contradiction. □ 

4.2.2 Nested Neighborhoods 

From now on we will assume in all of our arguments that there are no isolated vertices in G — A. The 
next difference is with respect to Lemma 4.6, which is just not true anymore. The lemma provided us 
with the nested structure of the neighborhoods in the modulator and was crucial for most of the proofs. 
As harmful as this appears to be at first, it turns out that we can prove a weaker version that is sufficient 
for our needs. 

Lemma 4.19 (New, weaker version of Lemma 4.6). Let $G$ be a graph and $X$ a chain-modulator. For
every pair of vertices $u$ and $v$ in the same bipartition of $G - X$ it holds that either $N(u) \subseteq N(v)$ or
$N(v) \subseteq N(u)$.

Proof. Let $u$ and $v$ be two vertices from the same bipartition of $G - X$. By the definition of chain graphs
we can assume that $N_{G-X}(u) \subseteq N_{G-X}(v)$. Assume for a contradiction that the lemma is not true. Then
there is a vertex $u' \in N_X(u) \setminus N_X(v)$ and a vertex $v' \in N_X(v) \setminus N_X(u)$. By definition, $u$ and $v$ are
not adjacent. Since there are no isolated vertices in $G - X$ there is a vertex $a \in N_{G-X}(u) \subseteq N_{G-X}(v)$.
Observe that if $a$ is adjacent to either $u'$ or $v'$ we get a $C_3$ that only has one vertex in $X$, which is a
contradiction (see $H_1$ in Figure 9). However, if $a$ is adjacent to neither $u'$ nor $v'$ then $\{u, v, u', v', a\}$
forms the same interaction with the modulator as $H_4$ in Figure 9 and hence our proof is complete. □
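
The nestedness property of Lemma 4.19 is easy to test on a concrete graph. The following is a minimal sketch, not part of the kernelization itself; the function name `neighborhoods_nested` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from itertools import combinations

def neighborhoods_nested(adj, side):
    """Return True if N(u) and N(v) are comparable by inclusion for every
    pair u, v in `side`. `adj` maps each vertex to its set of neighbors."""
    for u, v in combinations(side, 2):
        nu, nv = adj[u], adj[v]
        if not (nu <= nv or nv <= nu):
            return False
    return True

# Hypothetical example: three vertices on one side with nested neighborhoods.
adj = {
    "a1": {"b1"},
    "a2": {"b1", "b2"},
    "a3": {"b1", "b2", "x"},
}
print(neighborhoods_nested(adj, ["a1", "a2", "a3"]))
```

The check is quadratic in the size of the side, which is sufficient for illustrating the statement.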

One can observe that Lemma 4.19 is a sufficiently strong replacement for Lemma 4.6 since all proofs
apply the lemma to vertices from only one partition of $G - X$. The only exception is the proof
of Lemma 4.7, but by applying Lemma 4.19 on one partition at a time we obtain the following bound
instead:

$$|\{N_X(v) \text{ for } v \in V(G) \setminus X\}| \le 2|X| + 2.$$

4.2.3 An Irrelevant Vertex Rule 

It only remains to prove that the irrelevant vertex rule can still be applied with this new set of obstructions. 
Although the strategy is the same, the details are different and hence we provide the proof in full detail. 

Lemma 4.20. Let $(G, k)$ be an instance, $X$ a threshold-modulator and $v$ a central vertex in $G$. Then
$(G, k)$ is a yes-instance of Chain Editing (Chain Completion) if and only if $(G - v, k)$ is a
yes-instance.

Proof. For readability we only consider Chain Editing, however the exact same proof works for Chain 
Completion. For the forwards direction, for any vertex v, if (G, k) is a yes-instance, then (G — v, k) is 
also a yes-instance. This holds since chain graphs are hereditary. 

For the reverse direction, let $(G - v, k)$ be a yes-instance and assume for a contradiction that $(G, k)$ is
a no-instance. Let $F$ be a solution of $(G - v, k)$ satisfying Lemma 2.5, and let $G' = G \triangle F$. By assumption,
$(G, k)$ is a no-instance, so specifically, $G'$ is not a chain graph. Let $W$ be an obstruction in $G'$. Clearly
$v \in W$, since otherwise there is an obstruction in $(G - v) \triangle F$. Let $Z = V(W) - v$. For convenience we
will use $N'$ to denote neighborhoods in $G'$ and specifically for any set $Y \subseteq V(G')$, $N'_Y(v) = N_{G'}(v) \cap Y$.
Furthermore, let $(\mathcal{A}, \mathcal{B})$ be a chain decomposition of $G - X$ such that there is a large strip $S$ for which $v$
is central. Let $A = \bigcup \mathcal{A}$ and $B = \bigcup \mathcal{B}$. We will now consider the case when $v$ is in $A$. Since $|F| \le k$ and $S$
is a large strip it follows immediately that there are two vertices $w$ and $w'$ in $A \cap S$ in higher levels than $v$
that are not incident to $F$. Observe that $\{w, w', v\}$ forms an independent set of size three and that $W$
contains no such subgraph. Hence, we can assume without loss of generality that $w \notin V(W)$. Similarly,
we obtain a vertex $u$ in $A$ at a lower level than $v$ in $S$ such that $u \notin V(W)$.

Observe that $G'[Z \cup \{u\}]$ is not an obstruction and hence $N_Z(u) = N'_Z(u) \ne N'_Z(v) = N_Z(v)$. Since $u$
and $v$ are vertices in $A$ from the same strip it is true that $N_X(v) = N_X(u)$ and hence there is a vertex $a$
in $Z \cap B$ such that $\mathrm{lev}(u) < \mathrm{lev}(a) < \mathrm{lev}(v)$. In other words $u$ is adjacent to $a$, while $v$ and $w$ are not.
By a symmetric argument we obtain a vertex $b$ such that $\mathrm{lev}(v) < \mathrm{lev}(b) < \mathrm{lev}(w)$, meaning that both $u$
and $v$ are adjacent to $b$ while $w$ is not. We now fix a chain decomposition $(\mathcal{A}', \mathcal{B}')$ and let $A' = \bigcup \mathcal{A}'$
and $B' = \bigcup \mathcal{B}'$. Observe that $a$ and $b$ are regular vertices and hence it follows from the chain version
of Lemma 4.14 that $\{a, b\}$ is in $B'$. This yields immediately that $W$ is not a $C_3$ (since $a$ and $b$ are not
adjacent) and hence we are left with the cases of $W$ being a $2K_2$ or a $C_5$.

We now consider the case when $W$ is isomorphic to a $2K_2$. Let $y$ be the last vertex of $Z$, meaning
that $\{v, y, a, b\} = V(W)$. Observe that since $W$ is a $2K_2$ it holds that $y$ is adjacent to $a$, but not
to $b$. However, in $G$ it holds that $N(a) \subseteq N(b)$ and hence $F$ does not satisfy Lemma 2.5, which is a
contradiction.

Hence we are left with the case that $W$ is isomorphic to a $C_5$. Let $y, x$ be the last vertices of $Z$.
Observe that all vertices in $W$ should be of degree two and hence $a$ is adjacent to both $x$ and $y$. Recall
that $a$ is in $B'$ and observe that $u$ is in $A'$ by the same reasoning. Due to their adjacency to $a$, also $x$
and $y$ are in $A'$. It follows immediately that $u$, $x$ and $y$ form an independent set in $(G - v) \triangle F$. Since $u$
and $v$ are not touched by $F$ and are in the same strip it follows that $v$, $x$ and $y$ form an independent set
in $G'$. We observe that by this $W$ cannot be isomorphic to a $C_5$. The argument for the case when $v \in B$
is symmetrical and hence the proof is complete. □

We immediately obtain our kernelization results for modifications into chain graphs by the same wrap 
up as for threshold graphs. 

Theorem 7. The following three problems admit kernels with at most $O(k^2)$ vertices: Chain Deletion,
Chain Completion and Chain Editing.

5 Subexponential Time Algorithms 

5.1 Threshold Editing in Subexponential Time

In this section we give a subexponential time algorithm for Threshold Editing. We also show that 
we can modify the algorithm to work with Chain Editing. Combined with the results of Fomin and 
Villanger [14] and Drange et al. [10], we now have complete information on the subexponentiality of edge 
modification to threshold and chain graphs. In this section we aim to prove the following theorem: 

Theorem 8. Threshold Editing admits a $2^{O(\sqrt{k}\log k)} + \mathrm{poly}(n)$ subexponential time algorithm.

The additive poly(n) term comes from the kernelization procedure of Section 4. The remainder of
the algorithm operates on the kernel, and thus has running time that only depends on k. 

We will throughout refer to a solution $F$. In this case, we are assuming a given input instance $(G, k)$,
and then $F$ is a set of at most $k$ edges such that $G \triangle F$ is a threshold graph. In the next section,
Section 5.2, we will assume $G \triangle F$ to be a chain graph. Furthermore, after Section 5.1.1, we will be
working with the problem Split Threshold Editing, so we assume $F \subseteq C \times I$ when $(C, I)$ is the split
partition of $G$.
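
To make the notion of a solution concrete, the following minimal sketch applies an edit set as a symmetric difference and tests the result with the isolated-or-universal-vertex characterization of threshold graphs from the introduction. The function names and the edge representation are assumptions chosen for illustration; this is a checker, not the algorithm of this section.

```python
def apply_edits(edges, F):
    """Return the symmetric difference of E(G) and F, with edges as frozensets."""
    return {frozenset(e) for e in edges} ^ {frozenset(e) for e in F}

def is_threshold(vertices, edges):
    """Repeatedly peel an isolated or universal vertex; succeed if none remain."""
    verts = set(vertices)
    adj = {v: set() for v in verts}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    while verts:
        pick = next((v for v in verts
                     if len(adj[v]) == 0 or len(adj[v]) == len(verts) - 1), None)
        if pick is None:
            return False  # neither an isolated nor a universal vertex exists
        verts.remove(pick)
        for w in adj.pop(pick):
            adj[w].discard(pick)
    return True

# Hypothetical instance: the path a-b-c-d, edited by deleting the edge cd.
p4 = [("a", "b"), ("b", "c"), ("c", "d")]
F = [("c", "d")]
print(is_threshold("abcd", p4))
print(is_threshold("abcd", apply_edits(p4, F)))
```

Greedy peeling is safe here because threshold graphs are hereditary, so any eligible vertex may be removed first.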

Definition 5.1 (Potential split partition). Given a graph $G$ and an integer $k$ (called the budget), for $C$
and $I$ a partitioning of $V(G)$ we call $(C, I)$ a potential split partition of $G$ provided that

$$\binom{|C|}{2} - |E(C)| + |E(I)| \le k.$$

That is, the cost of making G into a split graph with the prescribed partitioning does not exceed the 
budget. 
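
As a small illustration of Definition 5.1, the sketch below computes the cost of forcing a prescribed partition, that is, the number of edges missing inside the clique side plus the number of edges present inside the independent side. The function names are hypothetical, and the edge list is assumed to contain each edge once.

```python
def split_partition_cost(edges, C, I):
    """Edges to add inside C plus edges to delete inside I."""
    C, I = set(C), set(I)
    inside_C = sum(1 for u, v in edges if u in C and v in C)
    inside_I = sum(1 for u, v in edges if u in I and v in I)
    missing_in_C = len(C) * (len(C) - 1) // 2 - inside_C
    return missing_in_C + inside_I

def is_potential_split_partition(edges, C, I, k):
    """True if the prescribed partition can be realised within budget k."""
    return split_partition_cost(edges, C, I) <= k

# Hypothetical example: the path a-b-c-d.
path = [("a", "b"), ("b", "c"), ("c", "d")]
print(split_partition_cost(path, {"b", "c"}, {"a", "d"}))
print(split_partition_cost(path, {"a", "b", "c"}, {"d"}))
```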

A brief explanation of the algorithm for Theorem 8. The algorithm consists of four parts, the 
first of which is the kernelization algorithm described in Section 4. This gives in polynomial time an 
equivalent instance $(G, k)$ with the guarantee that $|V(G)| = O(k^2)$. We may observe that this is a proper
kernel, i.e., the reduced instance’s parameter is bounded by the original parameter. This allows us to 
use time subexponential in the kernelized parameter. 

The second step in the algorithm selects a potential split partitioning of G. We show that the 
number of such partitionings is bounded subexponentially in k, and that we can enumerate them all in 
subexponential time. This step actually also immediately implies that editing^, completing and deleting 
to split graphs can be solved in subexponential time, however all of this was known [18, 15]. The main 
part of this step is Lemma 5.3. For the remainder of the algorithm, we may thus assume that the input 
instance is a split graph, and that the split partition needs to be preserved, that is, we focus on solving 
Split Threshold Editing. 

The third and fourth steps of the algorithm consist of repeatedly finding a special kind of separator
and solving structured parts individually. Step three consists of locating so-called cheap vertices (see
Definition 5.6 for a formal explanation). These are vertices $v$ whose neighborhood is almost correct, in
the sense that there is an optimal solution in which $v$ is incident to only $O(\sqrt{k})$ edges. The dichotomy
of cheap and expensive vertices gives us some tools for decomposing the graph. Specific configurations
of cheap vertices allow us to extract three parts: one part is highly structured, the second part is
provably small so that we may brute force it, and the last part we solve recursively. All of this is
done in subexponential time.

Henceforth we will have in mind a “target graph” $H = G \triangle F$ with threshold partitioning $(\mathcal{C}, \mathcal{I})$. We
refer to the set of edges $F$ as the solution, and assume $|F| \le k$.

^Indeed, editing to split graphs is solvable in linear time [18]. 

5.1.1 Getting the Partition

As explained above, a crucial part of the algorithm is to enumerate all sets of size at most $O(\sqrt{k})$. The
following lemma shows that this is indeed doable and we will use the result of this lemma throughout 
this section without necessarily referring to it. 

Lemma 5.2. For every $c \in \mathbb{N}$ there is an algorithm that, given an input instance $(G, k)$ with $|V(G)| = k^{O(1)}$,
enumerates all vertex subsets of size $c\sqrt{k}$ in time $2^{O(\sqrt{k}\log k)}$.

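Proof. Given an input graph $G = (V, E)$ with $|V| = n = k^{O(1)}$ and a natural number $k$ we can simply
output the family of sets $\mathcal{A} \subseteq 2^V$ of size at most $c\sqrt{k}$, which takes time

$$\sum_{i=1}^{c\sqrt{k}} \binom{n}{i} \le c\sqrt{k} \cdot \binom{n}{c\sqrt{k}} \le c\sqrt{k} \cdot n^{c\sqrt{k}} = 2^{O(\sqrt{k}\log n)} = 2^{O(\sqrt{k}\log k)},$$

where the first inequality follows since $\binom{n}{i}$ is increasing for $i$ from $1$ to $c\sqrt{k}$.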


□ 
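
The enumeration in Lemma 5.2 can be sketched directly with the standard library. The function name `small_subsets` is an assumption made for illustration, and the size bound is rounded down to $\lfloor c\sqrt{k} \rfloor$ so that it is an integer.

```python
from itertools import combinations
from math import isqrt

def small_subsets(vertices, k, c=1):
    """Yield every subset of `vertices` of size at most floor(c * sqrt(k))."""
    bound = isqrt(c * c * k)  # equals floor(c * sqrt(k)) for integers c, k >= 0
    for size in range(bound + 1):
        for subset in combinations(vertices, size):
            yield frozenset(subset)

# Hypothetical usage: 5 vertices, k = 4, c = 1, so subsets of size at most 2.
print(sum(1 for _ in small_subsets(range(5), 4)))
```

Using a generator keeps memory proportional to a single subset, even though the number of subsets produced is subexponential in $k$.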


The second step of the subexponential time algorithm was, as described above, to compute the potential
split partitionings of the input instance. Since we are given a general graph, we do not know immediately
which vertices will go to the clique partition and which will go to the independent set partition. However,
we now show that there are at most subexponentially many potential split partitionings. That is, there
are subexponentially many partitionings of the vertex set into $(C, I)$ such that it is possible to edit the
input graph to a threshold graph with the given partitioning not exceeding the prescribed budget.

The next lemma will be crucial in our algorithm, as our algorithm presupposes a fixed split partition.
Using this result, we may in subexponential time compute every possible split partition within range,
and run our algorithm for completion to threshold graphs on each of these split graphs.

Lemma 5.3 (Few split partitions). There is an algorithm that given a graph $G$ and an integer $k$ with
$|V(G)| = k^{O(1)}$ can generate a set $\mathcal{P}$ of split partitions of $V(G)$ such that for every split graph $H$ such
that $|E(H) \triangle E(G)| \le k$ and every split partition $(C, I)$ of $H$ it holds that $(C, I)$ is an element of $\mathcal{P}$.
Furthermore, the algorithm terminates in $2^{O(\sqrt{k}\log k)}$ time.

Proof. Let $G = (V, E)$ be a graph and $k$ a natural number. The first thing we do is to guess the size $s_C$
of the clique and let $C$ be a set of $s_C$ vertices of highest degrees, and $s_I = n - s_C$, and let $I = V(G) \setminus C$.
In the case that $\min\{s_C, s_I\} \le 6\sqrt{k}$ we can simply enumerate every partitioning by Lemma 5.2, so we
assume from now on that $\min\{s_C, s_I\} > 6\sqrt{k}$.

Claim 5.4. In any split graph $H$ with $|E(H) \triangle E(G)| \le k$, where $H$ has split partition $(C', I')$ with
$|C'| = s_C$, it holds that $|C \triangle C'| < 2\sqrt{k}$ and $|I \triangle I'| < 2\sqrt{k}$.

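Proof. Suppose that $2\sqrt{k}$ vertices $C'$ move from $C$ to $I$ and that $2\sqrt{k}$ vertices $I'$ move from $I$ to $C$.
Let $\sigma_C = \sum_{v \in C'} \deg(v)$ and $\sigma_I = \sum_{v \in I'} \deg(v)$. First, since the vertices are ordered by degree, $\sigma_I \le \sigma_C$.

Second, since in the final solution, $C'$ is in the independent set, $\sigma_C \le s_C \cdot 2\sqrt{k} + k$ (we might delete up to $k$
edges from $C'$) and using the same reasoning, $\sigma_I \ge 2\sqrt{k}(s_C - 2\sqrt{k}) + \binom{2\sqrt{k}}{2} - k = s_C \cdot 2\sqrt{k} - 3k - \sqrt{k}$ (we
might add up to $k$ edges to $I'$).

However, since $s_C > 6\sqrt{k}$, we have

$$s_C \cdot 2\sqrt{k} - 3k - \sqrt{k} \le \sigma_I \le \sigma_C \le s_C \cdot 2\sqrt{k} + k, \text{ and thus}$$
$$9k - \sqrt{k} < \sigma_I \le \sigma_C < 13k,$$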

yielding that $\sigma_C > 9k - \sqrt{k}$. However, we can only lower the total degree of $C'$ by $2k$, which means that
even if we spend the entire budget on deleting from $C'$, there is
a vertex in $C'$ with degree higher than the size of the clique (a contradiction). □

Observe that since $s_C$ and $s_I$ are fixed, if we move $i$ vertices from $C$ to $I$, we have to move $i$ vertices
from $I$ to $C$. Hence, if the claim holds, we can simply enumerate every set of $4\sqrt{k}$ vertices and take the
sets with equally many on each side and swap their partition. Adding each such partition to $\mathcal{P}$ gives the
set in question. □
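
The swapping argument of the proof can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it covers only the case where both sides are large (the small case is handled by exhaustive enumeration via Lemma 5.2 and is omitted), the bound $\lfloor 2\sqrt{k} \rfloor$ on the number of swapped vertices per side is taken from Claim 5.4, and the name `candidate_split_partitions` is hypothetical.

```python
from itertools import combinations
from math import isqrt

def candidate_split_partitions(adj, k):
    """Yield (C, I) candidates; `adj` maps each vertex to its neighbor set."""
    by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    max_swap = isqrt(4 * k)  # floor(2 * sqrt(k)), assumed swap bound per side
    for s_c in range(len(by_degree) + 1):      # guess the clique size
        C0, I0 = by_degree[:s_c], by_degree[s_c:]
        for i in range(min(max_swap, len(C0), len(I0)) + 1):
            for out in combinations(C0, i):    # vertices leaving the clique side
                for inc in combinations(I0, i):  # vertices joining the clique side
                    C = (set(C0) - set(out)) | set(inc)
                    I = (set(I0) - set(inc)) | set(out)
                    yield frozenset(C), frozenset(I)

# Hypothetical example: a star with centre c and leaves a, b, d.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
parts = set(candidate_split_partitions(star, 1))
print((frozenset({"c"}), frozenset({"a", "b", "d"})) in parts)
```

Each candidate would then be filtered by its cost as in Definition 5.1 before being added to the output set.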

We would like to remark that this lemma also gives a simpler algorithm for Split Completion
(equivalently Split Deletion). Ghosh et al. [15] showed that Split Completion can be solved in
time $2^{O(\sqrt{k}\log k)} \cdot \mathrm{poly}(n)$ using the framework of Alon, Lokshtanov and Saurabh [1]. However, the
following observation immediately yields a very simple combinatorial argument for the existence of such
an algorithm. Together with the polynomial kernel by Guo [17], the following result is immediate from
the above lemma.

Corollary 5.5. The problem Split Completion is solvable in time $2^{O(\sqrt{k}\log k)} + \mathrm{poly}(n)$.

Proof. The algorithm is as follows. On input $(G, k)$ we compute, using Lemma 5.3, every potential
split partitioning $(C, I)$ at most $k$ edges away from $G$. Then we in linear time check that $I$ is indeed
independent and that $C$ lacks at most $k$ edges from being complete. □


5.1.2 Cheap or Expensive? 

We will from now on assume that all our input graphs $G = (V, E)$ are split graphs provided with a split
partition $(C, I)$, and that we are to solve Split Threshold Editing, that is, we have to respect the
split partitioning. We are allowed to do this with subexponential time overhead, as per the previous
section and specifically Lemma 5.3. In addition, we assume that $|V(G)| = O(k^2)$.

Given an instance $(G, k)$ and a solution $F$, we define the editing number of a vertex $v$, denoted $\mathrm{en}^G_F(v)$,
to be the number of edges in $F$ incident to $v$. When $G$ and $F$ are clear from the context, we
will simply write $\mathrm{en}(v)$. A vertex $v$ will be referred to as cheap if $\mathrm{en}(v) \le 2\sqrt{k}$ and expensive otherwise.
We will call a set of vertices $U \subseteq V$ small provided that $|U| \le 2\sqrt{k}$ and large otherwise.

Definition 5.6. Given an instance $(G, k)$ with solution $F$, we call a vertex $v$ cheap if $\mathrm{en}(v) \le 2\sqrt{k}$.

The following observation will be used extensively. 

Observation 5.7. If $U \subseteq V(G)$ is a large set, then there exists a cheap vertex in $U$, or contrapositively:
if a set $U \subseteq V(G)$ has only expensive vertices, then $U$ is small. Specifically it follows that in any yes
instance $(G, k)$ where $F$ is a solution, there are at most $2\sqrt{k}$ expensive vertices.
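
The counting behind Observation 5.7 is that the editing numbers sum to $2|F| \le 2k$, so only few vertices can have a large editing number. The sketch below computes editing numbers for a given solution and extracts the expensive vertices. The function names are hypothetical, and a vertex is treated as expensive when its editing number exceeds $2\sqrt{k}$, which is an assumption about the exact boundary case.

```python
from collections import Counter
from math import sqrt

def editing_numbers(F):
    """Map each vertex to the number of edited pairs in F that contain it."""
    en = Counter()
    for u, v in F:
        en[u] += 1
        en[v] += 1
    return en

def expensive_vertices(F, k):
    """Vertices whose editing number exceeds 2 * sqrt(k) (assumed threshold)."""
    return {v for v, c in editing_numbers(F).items() if c > 2 * sqrt(k)}

# Hypothetical solution: five edits all touching the vertex x, with k = 5.
F = [("x", "a"), ("x", "b"), ("x", "c"), ("x", "d"), ("x", "e")]
print(expensive_vertices(F, 5))
```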

This gives the following win-win situation: If a set X is small, then we can “guess” it, which means 
that we can in subexponential time enumerate all candidates, and otherwise, we can guess a cheap vertex 
inside the set and its “correct” neighborhood. In particular, since the set of expensive vertices is small, 
we can guess it in the beginning. For the remainder of the proof we will assume that the graph G is 
a labeled graph, where some vertices are labeled as cheap and others as expensive. There will never 
be more than $2\sqrt{k}$ vertices labeled expensive, however a vertex labeled expensive might very well not
be expensive in G and vice versa. The idea is that we guess the expensive vertices at the start of the 
algorithm and then bring this information along when we recurse on subgraphs. 

5.1.3 Splitting Pairs and Unbreakable Segments 

Definition 5.8 (Splitting pair). Let $G$ be a graph, $k$ an integer, $F$ a solution of $(G, k)$ and $(\mathcal{C}, \mathcal{I})$ a
threshold decomposition of $G \triangle F$. We then say that the vertices $u \in I_a$ and $v \in C_b$ are a splitting pair if

• $a < b$,

• $u$ and $v$ are cheap,

• $\bigcup_{a<i<b} L_i$ consists of only expensive vertices. Recall from Definition 2.3 that $L_i = C_i \cup I_i$.

Definition 5.9 (Unbreakable). Let $G$ be a graph, $k$ an integer, $F$ a solution of $(G, k)$ and $(\mathcal{C}, \mathcal{I})$ a
threshold decomposition of $G \triangle F$. We then say that a sequence of levels $(C_a, I_a), (C_{a+1}, I_{a+1}), \ldots, (C_b, I_b)$
is an unbreakable segment if there is no splitting pair in the vertex set $\bigcup_{i \in [a,b]} (C_i \cup I_i)$.

Furthermore, we say that an instance $(G, k)$ is unbreakable if there exists an optimal solution $F$ and
a threshold decomposition $(\mathcal{C}, \mathcal{I})$ of $G \triangle F$ such that the entire decomposition is an unbreakable segment.
We also say that such a decomposition is a witness of $G$ being unbreakable.

Definition 5.10. Let $G$ be a graph and $(\mathcal{C}, \mathcal{I})$ a threshold decomposition of $G \triangle F$ for some solution $F$.
Then we say that $i$ is a transfer level if

• for every $j > i$ it holds that $C_j$ contains no cheap vertices and

• for every $j < i$ it holds that $I_j$ contains no cheap vertices.
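
The two conditions of Definition 5.10 translate directly into a search over level indices. The following is a minimal sketch assuming levels are given as a list of pairs of sets indexed from 0 upward; the name `transfer_levels` and this representation are hypothetical.

```python
def transfer_levels(levels, cheap):
    """Return all indices i such that no C_j with j > i and no I_j with
    j < i contains a cheap vertex. `levels` is a list of (C_j, I_j) sets."""
    result = []
    for i in range(len(levels)):
        above_ok = all(not (C & cheap) for C, _ in levels[i + 1:])
        below_ok = all(not (I & cheap) for _, I in levels[:i])
        if above_ok and below_ok:
            result.append(i)
    return result

# Hypothetical decomposition with three levels.
levels = [({"c0"}, {"i0"}), ({"c1"}, {"i1"}), ({"c2"}, {"i2"})]
cheap = {"c0", "c1", "i1", "i2"}
print(transfer_levels(levels, cheap))
```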

Lemma 5.11. Let $(G, k)$ be a yes instance of Split Threshold Editing with solution $F$ such that $G$
is unbreakable and $(\mathcal{C}, \mathcal{I})$ a witness. Then there is a transfer level in $(\mathcal{C}, \mathcal{I})$.

Proof. Suppose for a contradiction that the lemma is false. Let $a$ be maximal such that $C_a$ contains a
cheap vertex and $b$ minimum such that $I_b$ contains a cheap vertex. Since $i = a$ clearly satisfies the first
condition, it must be the case that $b < a$. Increment $b$ as long as $b + 1 < a$ and there is a cheap vertex
in $\bigcup_{i \in (b,a)} I_i$. Then decrement $a$ as long as $b + 1 < a$ and there is a cheap vertex in $\bigcup_{i \in (b,a)} C_i$. Let $u$
be a cheap vertex in $C_a$ and $v$ a cheap vertex in $I_b$. It follows from the procedure that they both exist.
Observe that $u, v$ is indeed a splitting pair, which is a contradiction to $G$ being unbreakable and $(\mathcal{C}, \mathcal{I})$
being a witness. □

Lemma 5.12. Let $(G, k)$ be an instance of Split Threshold Editing such that $G$ is unbreakable and
$(\mathcal{C}, \mathcal{I})$ a witness of this. Then the number of levels in $(\mathcal{C}, \mathcal{I})$ is at most $2\sqrt{k} + 1$.

Proof. Let $i$ be the transfer level in $(\mathcal{C}, \mathcal{I})$. It is guaranteed to exist by Lemma 5.11. Observe that for
every $j > i$ it holds that $C_j$ consists of expensive vertices and for every $j < i$ it holds that $I_j$ consists
of expensive vertices. It follows immediately that every level besides $i$ contains at least one expensive
vertex. As there are at most $2\sqrt{k}$ such vertices the result follows immediately. □

Lemma 5.13. Let $(G, k)$ be an instance of Split Threshold Editing such that $G$ is unbreakable,
$(\mathcal{C}, \mathcal{I})$ is a witness of this and $F$ a corresponding solution. If $X$ is the set of cheap vertices in $G$ then
$(G \triangle F)[X]$ forms a complete split graph.

Proof. Let $t$ be the transfer level of the decomposition, $u$ a cheap vertex in $C_i$ and $v$ a cheap vertex in $I_j$
for some $i$ and $j$. By the definition of $t$ it holds that $i \le t \le j$. It follows immediately that $u$ and $v$ are
adjacent in $G \triangle F$ and the proof is complete. □

We will now describe the algorithm unbreakAlg. It takes as input an instance $(G, (C, I), k)$ of Split
Threshold Editing, with the assumption that $G$ is unbreakable and has split partition $(C, I)$, and
returns either an optimal solution $F$ for $(G, k)$ where $|F| \le k$ or correctly concludes that $(G, k)$ is a
no-instance. Assume that $(G, k)$ is a yes-instance. Then there exists an optimal solution $F$ and a threshold
decomposition $(\mathcal{C}, \mathcal{I})$ of $G \triangle F$ that is a witness of $G$ being unbreakable. First, we guess the number of
levels $\ell$ in the decomposition, and by Lemma 5.12, we have that $\ell \in [0, 2\sqrt{k} + 1]$ and the transfer level
$t \in [0, \ell]$. Then we guess where the at most $2\sqrt{k}$ vertices that are expensive in $G$ are positioned in $(\mathcal{C}, \mathcal{I})$.
Observe that from this information we can obtain all edges between expensive vertices in $F$. Finally, we
put every cheap vertex in the level that minimizes the cost of fixing its adjacencies to the expensive
vertices while respecting that $t$ is the transfer level. From this information we can obtain all adjacencies
between cheap and expensive vertices in $F$. Since the cheap vertices induce a complete split graph, we
have reconstructed $F$ and hence we return it.

Lemma 5.14. Given an instance $(G, k)$ of Split Threshold Editing with $G$ being unbreakable,
unbreakAlg either gives an optimal solution or correctly concludes that $(G, k)$ is a no-instance in time
$2^{O(\sqrt{k}\log k)}$.

Proof. Since the algorithm goes through every possible value for $\ell$ and $t$ (according to Lemmata 5.11
and 5.12), and every possible placement of the expensive vertices, the only thing remaining to ensure 
is that the cheap vertices are placed correctly. However, since the cheap vertices form a complete split 
graph (according to Lemma 5.13), the only cost associated with a cheap vertex is the number of expensive 
vertices in the opposite side it is adjacent to. However, their placement is fixed, so we simply greedily 
minimize the cost of the vertex by putting it in a level that minimizes the number of necessary edits. 

If we get a solution from the above procedure, this solution is optimal. On the other hand, if in every 
branch of the algorithm we are forced to edit more than k edges, then either (G, k) is a no-instance, or G 
is not unbreakable. Since the assumption of the algorithm is that G is unbreakable, we conclude that 
the algorithm is correct. □ 

Figure 10: The partitioning of the vertex sets according to solveAlg. The square bags are the bags
containing the splitting pair, $U$ is an unbreakable segment and the bags of $X$ contain exclusively expensive
vertices. The edges drawn indicate the neighborhoods of the splitting pair across the partitions.


5.1.4 Divide and Conquer 

We now explain the main algorithm. The algorithm takes as input a graph $G$, together with a split
partition $(C, I)$ and a budget $k$. In addition, it takes a vertex set $S$ which the algorithm is supposed
to find an optimal solution for. The algorithm is recursive: it either finds a splitting pair, in which case it
recurses on a subset of $S$, or, if there is no splitting pair, then $G[S]$ is unbreakable, and thus it simply
runs unbreakAlg on $S$. To avoid unnecessary recomputations, it memoizes the results for inputs it has
already solved.

The algorithm solveAlg$(G, (C, I), k, S)$ returns an optimal solution for the instance $(G[S], k)$,
respecting the given split partition $(C, I)$ in the following manner:

(1) Run unbreakAlg$(G[S], (C \cap S, I \cap S), k)$.

(2) For every pair of cheap vertices $u \in I$ and $v \in C$, together with their correct neighborhoods $N_u$
and $N_v$, and every pair of subsets $C_X \subseteq C$ and $I_X \subseteq I$ of expensive vertices we do the following: Let
$X = I_X \cup C_X$, $R_C = N_u$, $U_I = N_v \cap I$, $R_I = I \setminus (X \cup U_I)$ and $U_C = S \setminus (X \cup R_C \cup U_I \cup R_I)$. Now,
$U = U_I \cup U_C$ is the unbreakable segment, $X$ is the set of expensive vertices between the splitting
pair, and $R = R_I \cup R_C$ is the set of remaining vertices. We now

(a) run unbreakAlg$(G[U], (C \cap U, I \cap U), k)$ yielding a solution $F_U$,

(b) solve $G[X]$ optimally by brute force since it has size at most $2\sqrt{k}$, giving a solution $F_X$, and

(c) recursively call solveAlg$(G, (C, I), k, R)$ to solve the instance corresponding to the remaining
vertices yielding $F_R$.

Finally we return $F$, the union of $F_U$, $F_X$, and $F_R$ together with all edges between $C \cap R$ and $I \cap (X \cup U)$,
and all edges from $C \cap X$ to $I \cap U$.

In (1) we consider the option that there are no splitting pairs in $G$. In (2) (see Figure 10) we guess
the uppermost splitting pair in the partition and the neighborhood of these two vertices. Then we guess
all of the expensive vertices that live in between the two levels of the splitting pair. Observe that these
expensive vertices together with the splitting pair partition the levels into three consecutive sequences.
The upper one, $U$, is an unbreakable segment, the middle one, $X$, consists of the expensive vertices, and the lower
one, $R$, is simply the remaining graph. When we apply unbreakAlg on the upper part, brute force the
middle one and recurse with solveAlg on the lower part, we get individual optimal solutions for each of the
three. Finally we may merge the solutions and add all the remaining edges (see end of (2)).

Lemma 5.15. Given a split graph $G = (V, E)$ with split partition $(C, I)$, solveAlg either returns an
optimal solution for Split Threshold Editing on input $(G, (C, I), k, V)$, or correctly concludes that
$(G, k)$ is a no-instance.

Proof. If $(G, k)$ with split partition $(C, I)$ is a yes instance of Split Threshold Editing there is a
solution $F$ with threshold decomposition $(\mathcal{C}, \mathcal{I})$ and a sequence of pairs $(u_1, v_1), (u_2, v_2), \ldots, (u_t, v_t)$ such
that $u_1, v_1$ is the splitting pair highest in $(\mathcal{C}, \mathcal{I})$ and $u_2, v_2$ is the highest splitting pair in the graph
induced by the vertices in and below the level of $v_1$, etc. Since we in a state $(G, (C, I), k, S)$ try every
possible pair of such cheap vertices and every possible neighborhood and set of expensive vertices, we
exhaust all possibilities for any threshold editing of $S$ of at most $k$ edges. Hence, if there is a solution,
an optimal solution is returned.

Thus, if ever an $F$ is constructed of size $|F| > k$, we can safely conclude that there is no editing set
$F^* \subseteq C \times I$ of size at most $k$ such that $G \triangle F^*$ is a threshold graph. □

Lemma 5.16. Given a split graph $G = (V, E)$ with split partition $(C, I)$ and an integer $k$ with $|V(G)| = O(k^2)$,
the algorithm solveAlg terminates in time $2^{O(\sqrt{k}\log k)}$ on input $(G, (C, I), k, V)$.

Proof. By charging a set $S$ for which solveAlg is called with input $(G, (C, I), k, S)$ every operation
except the recursive call, we need to (i) show that there are at most $2^{O(\sqrt{k}\log k)}$ many sets $S \subseteq V$ for
which solveAlg is called, and (ii) that the work done inside one such call is at most $2^{O(\sqrt{k}\log k)}$.

For Case (i), we simply note that when solveAlg is called with a set $S$, the sets $R$ on which we
recurse are uniquely defined by $u$, $v$, $N_u$, $N_v$, $X$, and there are at most $O(k^4) \cdot 2^{O(\sqrt{k}\log k)} = 2^{O(\sqrt{k}\log k)}$
such configurations, so at most $2^{O(\sqrt{k}\log k)}$ sets are charged. Case (ii) follows from the fact that we guess
two vertices, $u$ and $v$, and three sets, $N_u$, $N_v$ and $X$. For each choice we run unbreakAlg, which runs
in time $2^{O(\sqrt{k}\log k)}$ by Lemma 5.14, and the brute force solution takes time $2^{O(\sqrt{k}\log k)}$. The recursive
call is charged to a smaller set, and merging the solutions into the final solution we return, $F$, takes
polynomial time.

The two cases show that we charge at most $2^{O(\sqrt{k}\log k)}$ sets with $2^{O(\sqrt{k}\log k)}$ work, and hence solveAlg
completes after $2^{O(\sqrt{k}\log k)}$ steps. □


To conclude we observe that Theorem 8 follows directly from the above exposition. Given an input
$(G, k)$ to Threshold Editing, from the previous section we can in polynomial time obtain an equivalent
instance with at most $O(k^2)$ vertices. Furthermore, by Lemma 5.3 we may in time $2^{O(\sqrt{k}\log k)}$
assume we are solving the problem Split Threshold Editing. Finally, by Lemmata 5.15 and 5.16,
the theorem follows.

5.2 Editing to Chain Graphs 

We finally describe which steps are needed to change the algorithm above into an algorithm correctly 
solving Chain Editing in subexponential time. 

The main difference between Chain Editing and Threshold Editing is that it is far from clear 
that the number of bipartitions is subexponential, that is, is there a bipartite equivalent of the bound 
of the potential split partitions as in Lemma 5.3? If we were able to enumerate all such “potential 
bipartitions” in subexponential time, we could simply run a very similar algorithm to the one above on 
the problem Bipartite Chain Editing, where we are asked to respect the bipartition (see Section 3.2.1 
for the definition of this problem). 

It turns out that we indeed are able to enumerate all such potential bipartitions within the allowed 
time: 

Lemma 5.17. There is an algorithm which, given an instance $(G, k)$ for Chain Editing, enumerates
$\binom{n}{O(\sqrt{k})} = 2^{O(\sqrt{k}\log k)}$ bipartite graphs $H = (A, B, E')$ with $|E \triangle E'| \le k$ such that if $(G, k)$ is a yes
instance, then one output $(H, k)$ will be a yes instance for Bipartite Chain Editing, and furthermore
if any yes instance $(H, k)$ is output, then $(G, k)$ is a yes instance. This also holds for the deletion and
completion versions.

Proof. We first mention that it is trivial to change the proof below into proofs for the deletion and
completion versions: one simply disallows one of the operations. So we will prove only the editing version.
Furthermore, it is clear that if any output instance $(H, k)$ is a yes instance for Bipartite Chain
Editing, then $(G, k)$ was a yes instance for Chain Editing.

Consider any solution $H = (A, B, E')$ for an input instance $(G, k)$. If $\min\{|A|, |B|\} \le 6\sqrt{k}$,
then we can simply guess every such side in subexponential time. Hence, we assume that both sides of $H$
are large. But this means, by Observation 5.7, that both $A$ and $B$ have cheap vertices. Let $v_A$ be a
cheap vertex as low as possible in $A$ and $v_B$ be a cheap vertex as high as possible in $B$. It immediately
follows from the same observation that the set of vertices below $v_A$, $A_X$, is a set of expensive vertices,
and the same for the vertices above $v_B$, $B_X$. Since $v_A$ and $v_B$ are cheap, we know that we can in subexponential
time correctly guess their neighborhoods in $H$ and we can similarly guess $A_X$ and $B_X$.

Now, since we know $v_A$, $v_B$, $N_H(v_A)$ and $N_H(v_B)$, as well as $A_X$ and $B_X$, the only vertices we do
not know where to place are the vertices in $A$ which are in the levels above $\mathrm{lev}(v_B)$, call them $A_Y$, and
the vertices in $B$ which are in the levels below $\mathrm{lev}(v_A)$, call them $B_Y$. However, we know which set this is, that is, we
know $Z = A_Y \cup B_Y$. Define now $A_M = A \setminus (A_Y \cup A_X \cup \{v_A\})$ and similarly $B_M = B \setminus (B_Y \cup B_X \cup \{v_B\})$.
These are the vertices living in the middle of $A$ and $B$, respectively.

We now know that the vertices of $Z$ should form an independent set. This follows from the fact that
$A_M$ and $B_M$ are both non-empty. Hence, the vertices of $A_Y$ are in higher levels than all of $B_Y$, and since
there are no edges going from a vertex in $H$ to a vertex lower in $H$, and each of $A$ and $B$ is an independent
set, $Z$ must be an independent set.

The following is the crucial last step. We can in subexponential time guess the partitioning of levels
of both $A_X$ and $B_X$, since they are both of size at most $2\sqrt{k}$. Knowing these levels, we can
greedily insert each vertex in $Z$ into either $A$ or $B$ by pointwise minimizing the cost: a vertex $z \in Z$
can safely be placed in the level of $A$ or $B$ which minimizes the cost of making it adjacent to only the
vertices of $B_X$ above its level, or of making it adjacent to only the vertices below its level in $A_X$. □

Given the above lemma, we may work on the more restricted problem, Bipartite Chain Editing.
The rest of the algorithm actually goes through without any noticeable changes:

Theorem 9. Chain Editing is solvable in time $2^{O(\sqrt{k}\log k)} + \mathrm{poly}(n)$.

Proof. On input (G, k) we first run the kernelization algorithm from Section 4.2, and then we enumerate 
every potential bipartition according to Lemma 5.17. Now, for each bipartition {A, B) we make A into 
a clique, and run the Split Threshold Editing algorithm from Section 5.1 (see also Proposition 2.7). 

Now, (G, k) is a yes instance if and only if there is a bipartition (A, B) such that when making A 
into a clique, the resulting instance is a yes instance for Split Threshold Editing. □ 
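
The transformation used in this proof, turning one side of a bipartition into a clique, is a one-line operation on the edge set. The sketch below is only an illustration of that step; the function name `make_clique_on` and the example graph are hypothetical.

```python
from itertools import combinations

def make_clique_on(edges, A):
    """Return the edge set of G with every pair of vertices inside A added."""
    result = {frozenset(e) for e in edges}
    result |= {frozenset(pair) for pair in combinations(A, 2)}
    return result

# Hypothetical bipartite chain graph with sides A = {a1, a2} and B = {b1, b2}.
chain_edges = [("a1", "b1"), ("a2", "b1"), ("a2", "b2")]
split_edges = make_clique_on(chain_edges, ["a1", "a2"])
print(sorted(tuple(sorted(e)) for e in split_edges))
```

In the resulting graph $A$ is the clique side and $B$ the independent side, so the bipartition doubles as the split partition handed to the Split Threshold Editing algorithm.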

Corollary 5.18. Chain Deletion and Chain Completion are solvable in time $2^{O(\sqrt{k}\log k)} + \mathrm{poly}(n)$.

6 Conclusion 

In this paper we showed that the problems of editing edges to obtain a threshold graph and editing edges 
to obtain a chain graph are NP-complete. The latter solves a conjecture in the positive from Natanzon 
et al. [27] and both results answer open questions from Sharan [30], Burzyn et al. [4], and Mancini [24]. 

On the positive side, we show that both Threshold Editing and Chain Editing admit quadratic
kernels, i.e., given an instance $(G, k)$, we can in polynomial time find an equivalent instance $(G', k)$ where
$|V(G')| = O(k^2)$, and furthermore, $G'$ is an induced subgraph of $G$. We also show that these results hold
for the deletion and completion variants as well, and these results answer open questions by Liu et al. in
a recent survey on kernelization complexity of graph modification problems [20].

Finally we show that both problems admit subexponential algorithms of time complexity $2^{O(\sqrt{k}\log k)} + \mathrm{poly}(n)$.
This answers a recent open question by Liu et al. [22].

In addition, we give a proof for the NP-hardness of Chordal Editing which has been announced
in several places but which the authors have been unable to find. However, our NP-completeness proof for
Chordal Editing suffers a quadratic blow-up from 3Sat, i.e., $k = O(|\varphi|^2)$, so we cannot get better than
$2^{o(\sqrt{k})} \cdot \mathrm{poly}(n)$ lower bounds from this technique. The current best algorithm for Chordal Editing^
runs in time $2^{O(k \log k)} \cdot \mathrm{poly}(n)$ [6], and so this leaves a big gap. It would be interesting to see if we can
achieve tighter lower bounds, e.g., $2^{o(k)} \cdot \mathrm{poly}(n)$ time lower bounds for Chordal Editing assuming
ETH together with a $2^{O(k)} \cdot \mathrm{poly}(n)$ time algorithm.


^Here, the authors take Chordal Editing to allow vertex deletions. 